Abstract

Much of extremal graph theory has concentrated either on finding very small subgraphs of a large
graph (such as Turán's theorem) or on finding spanning subgraphs (such as Dirac's theorem or more
recently the Pósa conjecture). Only a few results give conditions to obtain some intermediate-sized
subgraph. We contend that this neglect is unjustified. In this paper we investigate minimum-degree
conditions under which a graph $G$ contains squared paths and squared cycles of arbitrary specified
lengths. We determine precise thresholds, assuming that the order of $G$ is large. This extends results of
Fan and Kierstead [J. Combin. Theory Ser. B 63 (1995), 55-64] and of Komlós, Sárközy, and Szemerédi
[Random Structures Algorithms 9 (1996), 193-211] concerning containment of a spanning squared path
and a spanning squared cycle, respectively.

1 Introduction 

One of the main programmes of extremal graph theory is the study of conditions on the vertex degrees of
a host graph $G$ under which a target graph $H$ appears as a subgraph of $G$ (which we denote by $H \subseteq G$).
Turán's theorem [20] is a prominent example of results of this type. It asserts that an average degree
$d(G) > \frac{r-2}{r-1} n$ forces a copy of the complete graph $K_r$ in $G$ (and that this is best possible), where here
and throughout $n$ is the number of vertices in the host graph $G$. More generally, the celebrated theorem
of Erdős and Stone [5] implies that for a fixed graph $H$ the chromatic number $\chi(H)$ of $H$ determines the
average degree that is necessary to guarantee a copy of $H$: if $H$ has chromatic number $\chi(H) = r$ and
$d(G) \ge (\frac{r-2}{r-1} + o(1))n$, then $H$ is a subgraph of $G$. This settles the problem for fixed target graphs, that is,
graphs that are 'small' compared to the host graph.

Dirac's theorem [4], another classical result from the area, considers target graphs that are of the same
order as the host graph, i.e., so-called spanning target graphs. Clearly, any average degree condition on the
host graph that enforces a connected spanning subgraph must be trivial, and hence the average degree needs
a suitable replacement in this setting. Here, the minimum degree is a natural candidate, and indeed, Dirac's
theorem asserts that every graph $G$ with minimum degree $\delta(G) \ge \frac{n}{2}$ has a Hamilton cycle. This implies in
particular that $G$ has a matching covering $2\lfloor n/2 \rfloor$ vertices.

A 3-chromatic version of this result follows from a theorem by Corrádi and Hajnal [3]: the minimum
degree condition $\delta(G) \ge 2\lfloor n/3 \rfloor$ implies the existence of a so-called spanning triangle factor in $G$, that is,
a collection of $\lfloor n/3 \rfloor$ vertex disjoint triangles. A well-known conjecture of Pósa (see, e.g., [6]) asserts that
roughly the same minimum degree actually guarantees the existence of a connected super-graph of a spanning
triangle factor. It states that any graph $G$ with $\delta(G) \ge \frac{2}{3}n$ contains a spanning squared cycle $C_n^2$ (where
the square of a graph, $F^2$, is obtained from $F$ by adding edges between all pairs of vertices with distance 2
in $F$). This can be seen as a 3-chromatic analogue of Dirac's theorem which turned out to be much more
difficult than its 2-chromatic cousin.

Fan and Kierstead [7] proved an approximate version of Pósa's conjecture for large $n$. In addition they
determined a sufficient and best possible minimum degree condition for the case that the squared cycle in
Pósa's conjecture is replaced by a squared path $P_n^2$, i.e., the square of a spanning path $P_n$.

Theorem 1 (Fan & Kierstead [8]). If $G$ is an $n$-vertex graph with minimum degree $\delta(G) \ge (2n - 1)/3$, then
$G$ contains a spanning squared path $P_n^2$.

The Pósa Conjecture was verified for large values of $n$ by Komlós, Sárközy, and Szemerédi [10]. The proof
in [10] actually asserts the following stronger result, which guarantees not only spanning squared cycles but
additionally squared cycles of all lengths between 3 and $n$ that are divisible by 3.

Theorem 2 (Komlós, Sárközy & Szemerédi [10]). There exists an integer $n_0$ such that for all integers
$n > n_0$ any graph $G$ of order $n$ and minimum degree $\delta(G) \ge \frac{2}{3}n$ contains all squared cycles $C_{3\ell}^2 \subseteq G$ with
$3 \le 3\ell \le n$. If furthermore $K_4 \subseteq G$, then $C_\ell^2 \subseteq G$ for any $3 \le \ell \le n$ with $\ell \ne 5$.

For squared cycles $C_\ell^2$ with $\ell$ not divisible by 3 the additional condition $K_4 \subseteq G$ is necessary because
these target graphs are not 3-colourable and hence a complete 3-partite graph shows that one cannot hope
to force $C_\ell^2$ unless $\delta(G) \ge (2n + 1)/3$. If $\delta(G) \ge (2n + 1)/3$, on the other hand, then Turán's Theorem asserts
that $G$ contains a copy of $K_4$ and hence Theorem 2 implies $C_\ell^2 \subseteq G$ for any $3 \le \ell \le n$ with $\ell \ne 5$. The case
$\ell = 5$ has to be excluded because $C_5^2$ is the 5-chromatic $K_5$.
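
The colourability claims in this paragraph can be checked by hand for small lengths. The following Python sketch builds the squared cycle on a given number of vertices and computes its chromatic number by exhaustive search. The function names `squared_cycle` and `chromatic_number` are illustrative choices, not notation from the paper, and the brute-force search is only meant for very small cycle lengths.

```python
from itertools import product


def squared_cycle(length):
    """Edge set of the square of the cycle on vertices 0..length-1."""
    edges = set()
    for i in range(length):
        for step in (1, 2):  # cycle edges and the added distance-2 chords
            j = (i + step) % length
            if i != j:
                edges.add(frozenset((i, j)))
    return edges


def chromatic_number(num_vertices, edges):
    """Smallest k admitting a proper k-colouring (exhaustive search)."""
    edge_list = [tuple(e) for e in edges]
    for k in range(1, num_vertices + 1):
        for colouring in product(range(k), repeat=num_vertices):
            if all(colouring[u] != colouring[v] for u, v in edge_list):
                return k
    return num_vertices


if __name__ == "__main__":
    for length in range(3, 9):
        print(length, chromatic_number(length, squared_cycle(length)))
```

For lengths 3 to 8 the search gives chromatic number 3 exactly when the length is divisible by 3, value 5 for length 5 (where the squared cycle has all 10 possible edges), and 4 otherwise.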

In this paper we address the question what happens between these two extrema of target graphs with 
constant order and spanning target graphs. We are interested in essentially best possible minimum degree 
conditions that enforce subgraphs covering a certain percentage of the host graph. 

Let us start with a simple example. It is easy to see that every graph $G$ with minimum degree $\delta(G) \ge \delta$ for
$0 \le \delta \le \frac{1}{2}n$ has a matching covering at least $2\delta$ vertices (see Proposition 11). This gives a linear dependence
between the forced size of a matching in the host graph and its minimum degree. The (considerably harder)
result of Corrádi and Hajnal [3] mentioned earlier is a variant of this for triangle factors.

Theorem 3 (Corrádi & Hajnal [3]). Let $G$ be a graph on $n$ vertices with minimum degree $\delta(G) = \delta \in
[\frac{1}{2}n, \frac{2}{3}n]$. Then $G$ contains $2\delta - n$ vertex disjoint triangles.

The main theorem of this paper is a corresponding result mediating between Turán's theorem and Pósa's
conjecture. More precisely, our aim is to provide exact minimum-degree thresholds for the appearance of a
squared path $P_\ell^2$ and a squared cycle $C_\ell^2$.

There are at least two reasonable guesses one might make as to what minimum degree $\delta(G) = \delta$ will
guarantee which length $\ell = \ell(n, \delta)$ of squared path (or longest squared cycle). On the one hand, the degree
thresholds for a spanning squared path or cycle and for a spanning triangle factor are approximately the
same. So perhaps this remains true for smaller $\ell$: in light of Theorem 3 one might expect that $\ell(n, \delta)$ is
roughly $3(2\delta(G) - n)$. This turns out to be far too optimistic.

On the other hand, proofs of preceding results dealing with spanning subgraphs essentially combine
greedy techniques with local changes. They simply start to construct the desired subgraph in (almost) any
location, and in the event of getting stuck change only a few of the vertices embedded so far; at no time do
they scrap an entire half-constructed object and start anew. It would not be unreasonable to believe that
this technique also leads to best possible minimum degree conditions for large but not spanning subgraphs.
Clearly, in the case of (unsquared) paths such a greedy strategy provides a path of length $\delta(G) + 1$. As $G$
might be disconnected, however, it cannot guarantee longer paths if $\delta(G) < n/2$. For squared paths the
following construction shows that with an arbitrary starting location one cannot hope for squared paths on
more than $\frac{3}{2}(2\delta(G) - n)$ vertices: If $G$ contains disjoint cliques $C$ and $C'$ of orders $2\delta - n$ and $n - \delta$, and an
independent set $I$ of order $n - \delta$ such that all vertices of $C$ and $C'$ are connected to all vertices of $I$ but not
to other vertices of $G$, then it is not difficult to see that the longest squared path in $G$ starting in an edge
of $C$ has length $\frac{3}{2}(2\delta(G) - n)$. This could lead to the idea that $\ell(n, \delta)$ is approximately $\frac{3}{2}(2\delta(G) - n)$. It
is true that there are squared paths of this length in $G$, but this lower bound is almost always excessively
pessimistic. In other words, it turns out that one has to carefully choose the 'region' of $G$ in which to look for the
desired squared path. Since spanning squared paths use all vertices of $G$ this problem does not occur for
these subgraphs.

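For fixed $n$ both guesses propose a linear dependence between $\delta$ and the length $\ell(n, \delta)$ of a forced squared
path (or cycle). As we will see below $\ell(n, \delta)$ as a function of $\delta$ behaves very differently: it is piecewise linear
but jumps at certain points. To make this precise we introduce the following functions. Given two positive
integers $n$ and $\delta$ with $\delta \in (\frac{1}{2}n, n - 1]$, we define $r_p(n, \delta)$ to be the largest integer $r$ such that
$n - \delta + \lfloor \delta/r \rfloor > \delta$ and $r_c(n, \delta)$ to be the largest integer $r$ such that $n - \delta + \lceil \delta/r \rceil > \delta$. We then define

$$\mathrm{sp}(n, \delta) := \min\Big\{ \Big\lfloor \tfrac{3}{2} \lceil \delta / r_p(n, \delta) \rceil + \tfrac{1}{2} \Big\rfloor,\ n \Big\} \quad\text{and}\quad \mathrm{sc}(n, \delta) := \min\Big\{ \Big\lfloor \tfrac{3}{2} \lceil \delta / r_c(n, \delta) \rceil \Big\rfloor,\ n \Big\}. \tag{1}$$

Observe that $\mathrm{sc}(n, \delta) \le \mathrm{sp}(n, \delta)$ and that for almost every $\delta$,
$\lim_{n \to \infty} \mathrm{sc}(n, \delta)/n = \lim_{n \to \infty} \mathrm{sp}(n, \delta)/n$. The
dependence between $\mathrm{sp}(n, \delta)$ and $\delta$ is illustrated in Figure 1.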




Figure 1: The behaviour of $\mathrm{sp}(n, \delta)$.
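
To get a feeling for the jumps of these functions, the definitions can be evaluated directly. The following Python sketch assumes the definitions read as: $r_p(n,\delta)$ is the largest $r$ with $n - \delta + \lfloor \delta/r \rfloor > \delta$, $r_c(n,\delta)$ is the largest $r$ with $n - \delta + \lceil \delta/r \rceil > \delta$, and sp and sc are as in (1). The helper `ceil_div` and the restriction to $n/2 < \delta \le n - 1$ are choices made for this illustration.

```python
def ceil_div(a, b):
    """Ceiling of a / b for positive integers."""
    return -(-a // b)


def r_p(n, delta):
    """Largest r with n - delta + floor(delta / r) > delta."""
    return max(r for r in range(1, delta + 1)
               if n - delta + delta // r > delta)


def r_c(n, delta):
    """Largest r with n - delta + ceil(delta / r) > delta."""
    return max(r for r in range(1, delta + 1)
               if n - delta + ceil_div(delta, r) > delta)


def sp(n, delta):
    # floor(3/2 * ceil(delta / r_p) + 1/2) equals (3 * ceil(...) + 1) // 2
    return min((3 * ceil_div(delta, r_p(n, delta)) + 1) // 2, n)


def sc(n, delta):
    # floor(3/2 * ceil(delta / r_c)) equals (3 * ceil(...)) // 2
    return min((3 * ceil_div(delta, r_c(n, delta))) // 2, n)


if __name__ == "__main__":
    n = 100
    for delta in (55, 60, 66, 67):
        print(delta, r_p(n, delta), sp(n, delta), sc(n, delta))
```

With $n = 100$, for example, sp takes the value 50 at $\delta = 66$ and jumps to 100 at $\delta = 67$, which matches the threshold $(2n-1)/3$ of Theorem 1.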

With this we are ready to formulate our main theorem, which states that $\mathrm{sp}(n, \delta)$ and $\mathrm{sc}(n, \delta)$ are the
maximal lengths of squared paths and cycles, respectively, forced in an $n$-vertex graph $G$ with minimum
degree $\delta$. More generally, and in accordance with Theorem 2, we show that $G$ contains any squared cycle of
length $3\ell \le \mathrm{sc}(n, \delta)$. We shall show below that these results are tight by explicitly
constructing extremal graphs $G_p(n, \delta)$ and $G_c(n, \delta)$ for squared paths and cycles. While the extremal graphs
of all previously discussed results are Turán graphs (complete $r$-partite graphs where $r = 3$ in the case of
squared paths and cycles) the graphs $G_p(n, \delta)$ and $G_c(n, \delta)$ have a rather different structure. In fact they
do contain squared cycles $C_\ell^2$ for all $3 \le \ell \le \mathrm{sc}(n, \delta)$ with $\ell \ne 5$. If any one of these 'extra' squared cycles
with chromatic number 4 is not present in the host graph $G$, then Theorem 4 guarantees even much longer
squared cycles $C_{3\ell}^2$ in $G$.

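Theorem 4. For any $\nu > 0$ there exists an integer $n_0$ such that for all integers $n > n_0$ and $\delta \in [(\frac{1}{2} + \nu)n, \frac{2}{3}n]$
the following holds for all $n$-vertex graphs $G$ with minimum degree $\delta(G) \ge \delta$.
(i) $P^2_{\mathrm{sp}(n,\delta)} \subseteq G$ and $C^2_{3\ell} \subseteq G$ for every $\ell \in \mathbb{N}$ with $\ell \le \mathrm{sc}(n, \delta)/3$.

(ii) Either $C^2_\ell \subseteq G$ for every $3 \le \ell \le \mathrm{sc}(n, \delta)$ with $\ell \ne 5$, or $C^2_{3\ell} \subseteq G$ for every $1 \le \ell \le 2\delta - n - \nu n$.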

The proof of this result relies on Szemerédi's Regularity Lemma¹ and is presented together with the main
lemmas in Section 2. Theorem 4 cannot be extended to all values of $\delta(G)$ with $\delta(G) - \frac{1}{2}n = o(n)$ because

¹ We refer to [14] for a survey on applications of the Regularity Lemma to graph embedding problems.
for infinitely many values of m there are C4-free graphs F on m vertices with 5(F) > }^\/m (see |17j). Then, 
letting G be the n- vertex graph obtained from F by adding an independent set / on m — \\\pm\ vertices 
and inserting all edges between F and /, it is easy to see that 6(G) > \n + \^/n but G does not contain a 
copy of Cf . 



The following extremal graphs show that the bounds in (i) and (ii) of Theorem 4 are tight (see also
Figure 2). For (ii) consider the complete tripartite graph $K_{n-\delta,\, n-\delta,\, 2\delta-n}$. Clearly, this graph has minimum
degree $\delta$ and does not contain $C_\ell^2$ for any $\ell \ge 3$ not divisible by 3 or $\ell > 3(2\delta - n)$. For the first part
of (i) let $G_p(n, \delta)$ be the $n$-vertex graph obtained from the disjoint union of an independent set $Y$ on $n - \delta$
vertices and $r := r_p(n, \delta)$ cliques $X_1, \dots, X_r$ with $|X_1| \le \dots \le |X_r| \le |X_1| + 1$ on a total of $\delta$ vertices, by
inserting all edges between $Y$ and $X_i$ for each $i \in [r]$. It is easy to check that $\delta(G_p(n, \delta)) = \delta$. Moreover any
squared path $P_m^2 \subseteq G_p(n, \delta)$ contains vertices from at most one clique $X_i$. As $Y$ is independent and $P_m^2$ has
independence number $\lceil m/3 \rceil$ we have $\lfloor 2m/3 \rfloor \le \lceil \delta / r_p(n, \delta) \rceil$ and thus $m \le \lfloor \frac{1}{2}(3 \lceil \delta / r_p(n, \delta) \rceil + 1) \rfloor = \mathrm{sp}(n, \delta)$.
For the second part of (i) we construct the graph $G'_c(n, \delta)$ in the same way as $G_p(n, \delta)$ but with $r := r_c(n, \delta)$
and with $|X_i| = \lceil \delta/r \rceil$ for all $i \in [r]$. To obtain an $n$-vertex graph $G_c(n, \delta)$ from $G'_c(n, \delta)$ choose $v_i$ in $X_i$
arbitrarily for each $i \in [r]$ and identify all $v_i$ with $i \le r \lceil \delta/r \rceil - \delta$. Again $G_c(n, \delta)$ has minimum degree $\delta$, any
squared cycle $C_m^2$ in $G_c(n, \delta)$ touches only one of the $X_i$, and hence $m \le \mathrm{sc}(n, \delta)$.
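
The claim that the path-extremal graph has minimum degree exactly $\delta$ can be verified mechanically for small parameters. The sketch below builds the graph as described (cliques of nearly equal size on $\delta$ vertices in total, an independent set on the remaining $n - \delta$ vertices, and all edges between the independent set and every clique). The vertex numbering and the function names `build_gp` and `min_degree` are assumptions made for this illustration.

```python
def r_p(n, delta):
    """Largest r with n - delta + floor(delta / r) > delta."""
    return max(r for r in range(1, delta + 1)
               if n - delta + delta // r > delta)


def build_gp(n, delta):
    """Adjacency sets of the path-extremal graph and its clique sizes.

    Vertices 0..delta-1 form the cliques, vertices delta..n-1 the
    independent set Y.
    """
    r = r_p(n, delta)
    base, extra = divmod(delta, r)
    # clique sizes are non-decreasing, differ by at most one, sum to delta
    sizes = [base + (1 if i >= r - extra else 0) for i in range(r)]
    adj = {v: set() for v in range(n)}
    y_vertices = range(delta, n)
    start = 0
    for size in sizes:
        clique = range(start, start + size)
        for u in clique:
            adj[u].update(w for w in clique if w != u)  # edges inside the clique
            adj[u].update(y_vertices)                   # edges to Y
            for y in y_vertices:
                adj[y].add(u)
        start += size
    return adj, sizes


def min_degree(adj):
    return min(len(neighbours) for neighbours in adj.values())


if __name__ == "__main__":
    adj, sizes = build_gp(21, 13)
    print(sizes, min_degree(adj))
```

Each vertex of the independent set has degree exactly $\delta$, and the defining inequality of $r_p(n, \delta)$ is precisely what ensures that vertices of the smallest clique also have degree at least $\delta$.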





Figure 2: The extremal graphs $G_p(n, \delta)$, $G_c(n, \delta)$ and $K_{n-\delta,\, n-\delta,\, 2\delta-n}$, for the special case $r_p(n, \delta) = r_c(n, \delta) = 4$.



Before closing this introduction let us remark that similar phenomena to those described in Theorem 4
are observed with simple paths and cycles. Every graph with minimum degree $\delta$ contains a path of length
$\lceil n / \lfloor n/(\delta + 1) \rfloor \rceil$, and this is attained by a vertex disjoint union of cliques. This follows from an easy
adjustment of the proof of Dirac's theorem. Improving on results of Nikiforov and Schelp [16] the first
author proved the following theorem in [1]. The methods used for obtaining this result are quite different
from those applied in this paper. In particular they do not rely on the Regularity Lemma.

Theorem 5 (Allen [1]). Given an integer $k \ge 2$ there is $n_0$ such that whenever $n \ge n_0$ and $G$ is an $n$-vertex
graph with minimum degree $\delta \ge n/k$, the following are true.
(i) $G$ contains $C_t$ for every even $4 \le t \le \lceil n/(k - 1) \rceil$.

(ii) If $G$ does not contain a cycle of every length from $\lceil 2n/\delta \rceil - 1$ to $\lceil n/(k - 1) \rceil$ inclusive then $G$ does
contain $C_t$ for every even $4 \le t \le 2\delta$.

2 Main lemmas and proof of Theorem 4

Our proof of Theorem 4 combines the Stability Method pioneered by Simonovits [18], the Regularity Method
which pivots around the joint application of Szemerédi's celebrated Regularity Lemma [19], and the so-called
Blow-up Lemma by Komlós, Sárközy and Szemerédi [11]. The combination of these two methods has proved
useful for a variety of exact embedding results and was applied for example in [10]. However, this
well-established technique provides only a rather loose framework for proofs of this kind. For our application we
will embellish this framework with a new concept, the so-called connected triangle components of a graph.

In this section we explain how we use connected triangle components, the Regularity Method, and the
Stability Method. We first provide the necessary definitions, formulate our main lemmas (whose proofs
are provided in the remaining sections of this paper), and sketch how they work together in the proof of
Theorem 4. The details of this proof are then presented at the end of the section.



Notation. For a graph $G$ we write $V(G)$ and $E(G)$ to denote its vertex set and edge set, respectively, and
set $v(G) = |V(G)|$, $e(G) = |E(G)|$ and $e(X, Y) = |E(G) \cap (X \times Y)|$ for sets $X, Y \subseteq V(G)$. The graph $G[X]$
is the subgraph of $G$ induced by $X$. The neighbourhood of a vertex $v$ in $G$ is denoted by $\Gamma(v)$ and $\Gamma(u, v)$ is
the common neighbourhood of $u, v \in V(G)$. For an edge $uv = e \in E(G)$ we also write $\Gamma(e) = \Gamma(u, v)$. The
minimum degree of $G$ is denoted by $\delta(G)$ and for two sets $X, Y \subseteq V(G)$ we define $\delta_Y(X) = \min_{x \in X} |\Gamma(x) \cap Y|$.



When we write $\varepsilon \gg \varepsilon'$ for two positive real numbers $\varepsilon$ and $\varepsilon'$ then we mean that $\varepsilon > \varepsilon'$ and that we can
make $\varepsilon$ arbitrarily small by choosing $\varepsilon'$ sufficiently small.

Connected triangle components and triangle factors. Connected triangle components and connected
triangle factors are the main protagonists in the proof of Theorem 4. Roughly speaking, in a connected
triangle component we can start in an arbitrary triangle and reach each other triangle by "walking" through
a sequence of triangles, and a connected triangle factor is a collection of vertex disjoint triangles each pair
of which is connected in this way.

To make this precise, let $G = (V, E)$ be a graph. A triangle walk in $G$ is a sequence of edges $e_1, \dots, e_p$ in
$G$ such that $e_i$ and $e_{i+1}$ share a triangle of $G$ for all $i \in [p - 1]$. We say that $e_1$ and $e_p$ are triangle connected
in $G$. A triangle component of $G$ is a maximal set of edges $C \subseteq E$ such that every pair of edges in $C$ is
triangle connected. Observe that this induces an equivalence relation on the edges of $G$, but a vertex may
be part of many triangle components. In addition a triangle component does not need to form an induced
subgraph of $G$ in general. The size $|C|$ of a triangle component $C$ is the number of vertices that are contained
in some edge of $C$.
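
Since triangle connectedness is an equivalence relation on edges, triangle components can be computed with a union-find structure: two edges are merged whenever they lie in a common triangle. The following Python sketch does this for a simple graph given by adjacency sets; the representation and the function names `triangle_components` and `component_vertices` are assumptions of this illustration.

```python
from itertools import combinations


def triangle_components(adj):
    """Partition the edges of a simple graph into triangle components.

    adj maps each vertex to the set of its neighbours (no loops).
    """
    edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    parent = {e: e for e in edges}

    def find(e):
        while parent[e] != e:
            parent[e] = parent[parent[e]]  # path halving
            e = parent[e]
        return e

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    # two edges lying in a common triangle belong to the same component
    for u, v, w in combinations(list(adj), 3):
        if v in adj[u] and w in adj[u] and w in adj[v]:
            union(frozenset((u, v)), frozenset((u, w)))
            union(frozenset((u, v)), frozenset((v, w)))

    components = {}
    for e in edges:
        components.setdefault(find(e), set()).add(e)
    return list(components.values())


def component_vertices(component):
    """Vertices contained in some edge of the component."""
    return set().union(*component)


if __name__ == "__main__":
    # two triangles sharing the single vertex 2
    bowtie = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3, 4}, 3: {2, 4}, 4: {2, 3}}
    for comp in triangle_components(bowtie):
        print(sorted(component_vertices(comp)))
```

In the example of two triangles sharing one vertex there are two triangle components of size 3, and the shared vertex belongs to both, which illustrates that the components partition the edges but not the vertices.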

A triangle factor $T$ in a graph $G$ is a collection of vertex disjoint triangles in $G$. $T$ is a connected triangle
factor if all edges of $T$ are in the same triangle component of $G$. The size of $T$ is the number of vertices
covered by $T$. We let $\mathrm{CTF}(G)$ denote the maximum size of a connected triangle factor in $G$. It is not difficult
to check for example that any connected triangle factor in $G_p(n, \delta)$ contains only vertices of at most one of
the cliques $X_i$ (cf. the definition of $G_p(n, \delta)$ below Theorem 4) and of the independent set $Y$. Hence



and the graph $G_p(n, \delta)$ is also extremal with respect to the size of a connected triangle factor for a given
minimum degree.

We will usually find that the number of vertices in a triangle component and the size of a maximum 
connected triangle factor in that component are quite different. As we will explain next, for the purposes of 
embedding squared paths and squared cycles, it is the size of a connected triangle factor that is important. 

The Regularity Method. The Regularity Lemma provides a partition of a dense graph that is suitable 
for an application of the Blow-up Lemma, which is an embedding result for large host graphs. In order to 
formulate the versions of these two lemmas that we will use, we first introduce some terminology. 

Let $G = (V, E)$ be a graph and $\varepsilon, d \in (0, 1]$. For disjoint nonempty $U, W \subseteq V$ the density of the pair
$(U, W)$ is $d(U, W) = e(U, W)/(|U||W|)$. A pair $(U, W)$ with density at least $d$ is $(\varepsilon, d)$-regular if
$|d(U', W') - d(U, W)| \le \varepsilon$ for all $U' \subseteq U$ and $W' \subseteq W$ with $|U'| \ge \varepsilon|U|$ and $|W'| \ge \varepsilon|W|$. An $(\varepsilon, d)$-regular partition of $G$
with reduced graph $R = (V(R), E(R))$ is a partition $V_0 \cup V_1 \cup \dots \cup V_k$ of $V$ with $|V_0| \le \varepsilon|V|$, $|V_i| = |V_j|$ for
all $i, j \in [k]$, $V(R) = \{V_1, \dots, V_k\}$, such that $(V_i, V_j)$ is an $(\varepsilon, d)$-regular pair in $G$ whenever $V_iV_j \in E(R)$,
and for each $v \in V_i$ with $i \in [k]$ there are at most $(\varepsilon + d)n$ edges incident to $v$ that are not contained in an
$(\varepsilon, d)$-regular pair corresponding to an edge of $R$ (this additional requirement is not standard). In this case
we also say that $G$ has $(\varepsilon, d)$-reduced graph $R$ and call the partition classes $V_i$ with $i \in [k]$ clusters of $G$.
Observe that our definition of the reduced graph implies that for $T \subseteq V(R)$ we can for example refer to the
set $\bigcup T$, which is a subset of $V(G)$.
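
For very small vertex sets the notions of density and regularity of a pair can be tested exhaustively. The sketch below is an illustration only: it assumes the convention that the density deviation must be at most $\varepsilon$ on all sub-pairs of relative size at least $\varepsilon$, it represents edges as a set of two-element frozensets, and its running time is exponential in the sizes of the two sets.

```python
from itertools import combinations


def density(edges, U, W):
    """d(U, W) = e(U, W) / (|U| |W|) for disjoint nonempty U and W."""
    count = sum(1 for u in U for w in W if frozenset((u, w)) in edges)
    return count / (len(U) * len(W))


def subsets_of_size_at_least(S, min_size):
    """All nonempty subsets of S with at least min_size elements."""
    items = sorted(S)
    for size in range(1, len(items) + 1):
        if size >= min_size:
            yield from combinations(items, size)


def is_regular(edges, U, W, eps, d):
    """Exhaustively test whether (U, W) is an (eps, d)-regular pair."""
    base = density(edges, U, W)
    if base < d:
        return False
    for U1 in subsets_of_size_at_least(U, eps * len(U)):
        for W1 in subsets_of_size_at_least(W, eps * len(W)):
            if abs(density(edges, U1, W1) - base) > eps:
                return False
    return True


if __name__ == "__main__":
    U, W = {0, 1, 2, 3}, {4, 5, 6, 7}
    matching = {frozenset((i, i + 4)) for i in range(4)}
    print(density(matching, U, W), is_regular(matching, U, W, 0.2, 0.25))
```

A complete bipartite pair passes the test for every $\varepsilon$, whereas a perfect matching between two sets of four vertices has density 1/4 but fails already for $\varepsilon = 0.2$, because a single matched pair of vertices has density 1.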

In this paper we will use the following version of the Regularity Lemma which is an easy corollary of the 
so-called degree version of this lemma (see, e.g., [Ml Theorem 1.10]). 



We also write $\delta_G(X) = \delta_{V(G)}(X)$.



Lemma 6 (Regularity Lemma). For all $\varepsilon > 0$ and $m_0$ there is $m_1$ such that every graph $G$ on $n \ge m_1$ vertices
with $\delta(G) \ge \gamma n$ has an $(\varepsilon, d)$-reduced graph $R$ on $m$ vertices with $m_0 \le m \le m_1$ and $\delta(R) \ge (\gamma - d - \varepsilon)m$.

This lemma asserts that the reduced graph $R$ of $G$ "inherits" the high minimum degree of $G$. We shall
use this property in order to reduce the original problem of finding a squared path (or cycle) in an $n$-vertex
graph with minimum degree $\gamma n$ to the problem of finding an arbitrary connected triangle factor of a certain
size in an $m$-vertex graph $R$ with minimum degree $(\gamma - d - \varepsilon)m$. The new problem is much less particular
about the required subgraph than the original one and hence easier to attack (see Lemma 8).

This kind of reduction is made possible by the Blow-up Lemma. Roughly, this lemma asserts that a
bounded degree graph $H$ can be embedded into a graph $G$ with reduced graph $R$ if there is a homomorphism
from $H$ to a small subgraph $S$ of $R$ which does not "overfill" any of the clusters in $S$. In our setting we apply
this lemma with $S = K_3$ and conclude that for each triangle $t$ of a connected triangle factor $T$ in $R$ we find a
squared path in $G$ that almost fills the clusters of $G$ corresponding to $t$. By using the fact that $T$ is triangle
connected it is then possible to connect these squared paths into squared paths or cycles of the desired overall
length. In addition, the Blow-up Lemma allows for some control over the start- and end-vertices of the
path that is constructed in this way (cf. Lemma 7 (iii)).

The following lemma summarises this embedding technique, which is also implicit in [10]. For
completeness we provide a proof of this lemma in the appendix.

Lemma 7 (Embedding Lemma). For all $d > 0$ and $m_{EL} \in \mathbb{N}$ there exist $n_{EL} \in \mathbb{N}$ and $\varepsilon > 0$ such that the
following hold for any graph $G$ on $n \ge n_{EL}$ vertices with $(\varepsilon, d)$-reduced graph $R'$ on $m \ge m_{EL}$ vertices.
(i) $C^2_{3\ell} \subseteq G$ for every $\ell \in \mathbb{N}$ with $3\ell \le (1 - d)\,\mathrm{CTF}(R')\frac{n}{m}$.

(ii) If $K_4 \subseteq C$ for each triangle component $C$ of $R'$, then $C^2_\ell \subseteq G$ for every $\ell \in [3, (1 - d)\,\mathrm{CTF}(R')\frac{n}{m}] \setminus \{5\}$.
Furthermore, let $T$ be a connected triangle factor in a triangle component $C$ of $R'$ with $K_4 \subseteq C$, let
$u_1v_1, u_2v_2 \in E(G)$ be disjoint edges, and suppose that there are (not necessarily disjoint) edges $X_1Y_1, X_2Y_2 \in
C$ such that the edge $u_iv_i$ has at least $2d\frac{n}{m}$ common neighbours in each cluster $X_i$ and $Y_i$, for $i = 1, 2$. Then

(iii) $P^2_\ell \subseteq G$ for every $\ell \in \mathbb{N}$ with $(m + 2)^2 \le \ell \le (1 - d)|T|\frac{n}{m}$, such that $P^2_\ell$ starts in $u_1, v_1$ and ends in
$u_2, v_2$ (in those orders) and at most $(\varepsilon + d)n$ vertices of $P^2_\ell$ are not in $\bigcup T$.

The copies of $K_4$ that are required in this lemma play a crucial role when embedding squared cycles
which are not 3-chromatic.

The Stability Method. The strategy we just described leaves us with the task of finding a big connected
triangle factor $T$ in the reduced graph $R$ of $G$. However, there is one problem with this approach: The
proportion $\tau$ of $R$ covered by $T$ is roughly equal to the proportion of $G$ covered by the squared path $P$ that
we obtain from the Embedding Lemma (Lemma 7). However, as explained above, the relative minimum
degree $\gamma_R = \delta(R)/|V(R)|$ of $R$ is in general slightly smaller than $\gamma_G = \delta(G)/|V(G)|$, but the extremal graphs
for squared paths and connected triangle factors are the same. It follows that we cannot expect that $\tau$ is
larger than the proportion a maximum squared path covers in a graph with relative minimum degree $\gamma_R$,
and hence it is smaller than the proportion we would like to cover for relative minimum degree $\gamma_G$.

Consequently we need to be more ambitious and shoot for a bigger connected triangle factor in $R$ than
we can expect for this minimum degree (cf. Lemma 8 (S1) and (S2)). This will of course not always be
possible, but it will only fail if $R$ (and hence $G$) is 'very close' to the extremal graph $G_p(|V(R)|, \delta(R))$ (and
hence also to $G_c(|V(R)|, \delta(R))$), in which case we will say that $R$ is near-extremal (cf. Lemma 8 (S3)).

This approach is called the Stability Method and the following lemma states that it is feasible for our
purposes. It additionally guarantees copies of $K_4$ as required by the Embedding Lemma. We formulate this
lemma for graphs $G$, but use it on the reduced graph $R$ later. Its proof does not rely on the Regularity
Lemma and is given in Section 3.

Lemma 8 (Stability Lemma). Given $\mu > 0$, for any sufficiently small $\eta > 0$ there exists $n_0$ such that if $G$
has n > uq vertices and 8(G) = 8 G ((5 + n)n, $ )> then either 

(S1) $\mathrm{CTF}(G) \ge 3(2\delta - n)$, or

(S2) $\mathrm{CTF}(G) \ge \min(\mathrm{sp}(n, \delta + \eta n),$ or

(S3) $G$ has an independent set of size at least $n - \delta - 11\eta n$ whose removal disconnects $G$ into components
of size at most $\frac{19}{10}(2\delta - n)$.

Moreover, in cases (S2) and (S3) each triangle component of $G$ contains a $K_4$.

It remains to handle the graphs with near-extremal reduced graph. For these graphs we have a lot of 
structural information which enables us to show directly that they contain the squared paths and squared 
cycles we desire, as the following lemma documents. The proof of this lemma is provided in Section 3J 

Lemma 9 (Extremal Lemma). For every $\nu > 0$, whenever $\nu \gg \mu \gg d \gg \varepsilon > 0$ there exists $N$ such that the
following holds. Suppose $G$ is a graph of order $n > N$ and minimum degree $\delta(G) = \delta \ge \frac{n}{2} + \nu n$ with $(\varepsilon, d)$-reduced
graph $R$ of order $m \ge m_{EL}$. Suppose that $V(R)$ is decomposed into non-empty sets $I, B_1, B_2, \dots, B_k$,
where $k \ge 2$, $I$ is an independent set of size at least $(n - \delta - \mu n)m/n$, where each set $B_i$ has at most
$19m(2\delta - n)/(10n)$ vertices, where for any $i \ne j$ there are no edges between $B_i$ and $B_j$ in $R$, and each
triangle component of $R$ contains a copy of $K_4$.

Then $G$ contains $P^2_{\mathrm{sp}(n,\delta)}$ and $C^2_\ell$ for each $\ell \in [3, \mathrm{sc}(n, \delta)] \setminus \{5\}$.

It is interesting to notice that, although the two functions $\mathrm{sp}(n, \delta)$ and $\mathrm{sc}(n, \delta)$ are different (their jumps
as $\delta$ increases occur at slightly different values), they are similar enough that the Stability Lemma covers
them both. We will only need to distinguish between squared paths and squared cycles when we examine
the near-extremal graphs.

Proof of Theorem 4. With this we have all ingredients for the proof of our main theorem, which uses the
Regularity Lemma (Lemma 6) to construct a regular partition with reduced graph $R$ of the host graph $G$,
the Stability Lemma (Lemma 8) to conclude that $R$ either contains a big connected triangle factor or is
near-extremal, the Embedding Lemma (Lemma 7) to find long squared paths and cycles in $G$ in the first
case, and the Extremal Lemma (Lemma 9) in the second case.

Proof of Theorem 4. We require our constants to satisfy

$$\nu \gg \mu \gg \eta \gg d \gg \varepsilon > 0,$$

which we choose, given $\nu$, in that order; and we choose $n_0 > n_{EL}$ sufficiently large that we may apply Lemma 6
to any $n$-vertex graph, $n \ge n_0$, to obtain an $(\varepsilon, d)$-regular partition with more than $m_{EL}$ parts (where $m_{EL}$
and $n_{EL}$ are as in Lemma 7).

Let $n \ge n_0$ and $\delta \in (n/2 + \nu n, n - 1]$. Let $G$ be any $n$-vertex graph with $\delta(G) \ge \delta$. We first apply Lemma 6
to $G$ to obtain an $(\varepsilon, d)$-reduced graph $R$ on $m \ge m_{EL}$ vertices. Let $\delta' = \delta(R) \ge (\delta/n - d - \varepsilon)m > m/2 + \mu m$.
Then we apply Lemma 8 to $R$. There are three possibilities.

First, we could find that $\mathrm{CTF}(R) \ge 3(2\delta' - m)$. In this case by Lemma 7 we are guaranteed that
for every integer $\ell$ with $3\ell \le (1 - d)\,\mathrm{CTF}(R)n/m$ we have $C^2_{3\ell} \subseteq G$. By choice of $d$ and $\varepsilon$ we have
$3(2\delta' - m)(1 - d)n/m \ge 3(2\delta - n - \nu n)$. Noting that $P^2_\ell \subseteq C^2_\ell$ we have $P^2_{\mathrm{sp}(n,\delta)} \subseteq G$ and $C^2_{3\ell} \subseteq G$ for each
integer $\ell \le 2\delta - n - \nu n$ as required.

Second, we could find that $\mathrm{CTF}(R) \ge \min(\mathrm{sp}(n, \delta + \eta n),$ and that every triangle component of $R$
contains a copy of $K_4$. By Lemma 7 we are guaranteed that for every $\ell \in [3, (1 - d)\,\mathrm{CTF}(R)n/m] \setminus \{5\}$ we
have $C^2_\ell \subseteq G$. By choice of $d$ and $\varepsilon$ we have $(1 - d)\,\mathrm{CTF}(R)n/m \ge \mathrm{sp}(n, \delta)$, so we have $P^2_{\mathrm{sp}(n,\delta)} \subseteq G$ and for
each integer $\ell \in [3, \mathrm{sc}(n, \delta)] \setminus \{5\}$ we have $C^2_\ell \subseteq G$ as required.

Third, we could find that $R$ is near-extremal: $R$ contains an independent set on at least $m - \delta' - 11\eta m$
vertices whose removal disconnects $R$ into components of size at most $\frac{19}{10}(2\delta' - m)$, and each triangle
component of $R$ contains a copy of $K_4$. But now $G$ satisfies the conditions of Lemma 9. It follows that $G$
contains $P^2_{\mathrm{sp}(n,\delta)}$ and for each $\ell \in [3, \mathrm{sc}(n, \delta)] \setminus \{5\}$ the graph $G$ contains $C^2_\ell$. □

3 Triangle Components and the proof of Lemma 8

In this section we provide a proof of our stability result for connected triangle factors, Lemma 8. All
arguments in this proof are of combinatorial nature. Distinguishing different cases, we analyse the sizes and
the structure of the triangle components in the graph $G$ under study. Before we give more details about
our strategy and a sketch of the proof, we introduce some additional definitions and provide a preparatory
lemma (Lemma 10).

Let $G$ be a graph with triangle components $C_1, \dots, C_r$. The vertices of a triangle component $C_i$ are all
vertices $v$ such that some edge $uv$ of $G$ is contained in $C_i$. The interior $\mathrm{int}(G)$ of $G$ is the set of vertices of
$G$ which are in more than one of the triangle components. For a component $C_i$ the interior of $C_i$, $\mathrm{int}(C_i)$, is
the set of vertices of $C_i$ which are in $\mathrm{int}(G)$. The remaining vertices of $C_i$ are called the exterior $\partial(C_i)$.
That is, $\partial(C_i)$ is formed by the set of vertices of $C_i$ which are in no other triangle component of $G$. To give
an example, by definition the graph $G_p(n, \delta)$ has $r_p(n, \delta)$ triangle components; its interior is the independent
set $Y$, with the component exteriors being the cliques $X_1, \dots, X_r$. Similarly, $G_c(n, \delta)$ has $r_c(n, \delta)$ triangle
components. The following lemma collects some observations about triangle components.

Lemma 10. Let $G$ be an $n$-vertex graph with minimum degree $\delta > n/2$. Then

(a) each triangle component $C$ of $G$ satisfies $|C| > \delta$,

(b) for distinct triangle components $C$, $C'$ we have $e(\partial(C), \partial(C')) = 0$,

(c) for each triangle component $C$, each vertex $u$ of $C$, and $U := \{v : uv \in C\}$ the minimum degree in $G[U]$
is at least $2\delta - n$ and hence $|G[U]| \ge 2\delta - n + 1$.



Proof. To see (a) let $M$ be the vertices of a maximal clique in $C$ (clearly $|M| \ge 3$). If $u$ and $v$ are in $M$,
and $x$ is a common neighbour of $u$ and $v$, then $x$ is also in $C$. Thus vertices of $G \setminus C$ are adjacent to at most
1 vertex of $M$ and vertices of $C$ are adjacent to at most $|M| - 1$ (by maximality) vertices of $M$. This gives
the inequality

$$|M|\delta \le \sum_{m \in M} |\Gamma(m)| \le \sum_{x \in C} (|M| - 1) + \sum_{x \notin C} 1$$

and hence $|M|\delta - n \le (|M| - 2)|C|$. Since $n < 2\delta$ we have $|C| > \delta$ as required.

The assertion (b) follows from the fact that $\Gamma(u, u') \ne \emptyset$ for $u \in \partial(C)$ and $u' \in \partial(C')$ and thus, if
$uu'$ was an edge, $u$ and $u'$ would be in some triangle component $C''$ contradicting the fact that they are
in the exterior. Moreover, for an edge $uv$ of $C$ we have $\Gamma(u, v) \subseteq C$ as $C$ is a triangle component. Since
$|\Gamma(u, v)| \ge 2\delta - n$ we get (c). □



Let us sketch the proof of Lemma 8. Lemma 10(a) states that triangle components cannot be too small.
However, it is not solely the size of the triangle components we are interested in: We want to find a triangle
component that contains many vertex disjoint triangles. At this point, Lemma 10(c) comes into play. It
asserts that certain spots in a triangle component induce a graph with minimum degree $2\delta - n$. In the proof
of Lemma 8 we shall usually use this fact in order to find a big matching $M$ in such spots (Proposition 11
below asserts that this is possible). Clearly all edges in such a matching are triangle connected and hence
it will remain to extend $M$ to a set of vertex disjoint triangles. For this purpose we will analyse the size
of the common neighbourhood $\Gamma(u, v)$ of an edge $uv$ in $M$. We will usually find that $\Gamma(u, v)$ is so big that
a simple greedy strategy allows us to construct the triangles. For estimating $|\Gamma(u, v)|$ we will often use the
following technique: We find a large set $X$ such that neither $u$ nor $v$ has neighbours in $X$. This implies
$|\Gamma(u, v)| \ge 2\delta - (n - |X|)$. Observe that Lemma 10(b) implies that $\partial(C)$ can serve as $X$ if both $u, v \in \partial(C')$
for some triangle components $C$ and $C'$.
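
The counting step behind this bound is plain inclusion-exclusion. Written out, under the stated assumption that neither $u$ nor $v$ has a neighbour in $X$ (so that both neighbourhoods lie in the complement of $X$), it reads:

```latex
% Assumption: \Gamma(u) \cup \Gamma(v) \subseteq V(G) \setminus X,
% hence |\Gamma(u) \cup \Gamma(v)| \le n - |X|.
\begin{align*}
|\Gamma(u,v)| &= |\Gamma(u)| + |\Gamma(v)| - |\Gamma(u) \cup \Gamma(v)| \\
              &\ge \delta + \delta - (n - |X|) \\
              &= 2\delta - (n - |X|).
\end{align*}
```

Taking $X = \emptyset$ recovers the bound $|\Gamma(u, v)| \ge 2\delta - n$ used in the proof of Lemma 10(c).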

The strategy we just described works for most values of $\delta$ below $\frac{3}{5}n$ (we describe the exceptions below).
For $\delta \ge \frac{3}{5}n$ however, the greedy type argument fails, the reason being that we usually bound the common
neighbourhood of an edge used in the argument above by $4\delta - 2n$. But for $\delta \ge \frac{3}{5}n$ we might have $\mathrm{sp}(n, \delta) >
4\delta - 2n$. We solve this problem by using a different strategy in this range of $\delta$. We will still start with a big
connected matching $M$ as before, but use a Hall-type argument to extend $M$ to a triangle factor $T$. More
precisely, we find $M$ in the exterior of some triangle component and then consider for each edge $uv$ of $M$
all common neighbours of $uv$ in $\mathrm{int}(G)$. The Hall-type argument then permits us to find distinct extensions
for the edges of $M$. To make this argument work we use the fact that in this range of $\delta$ the set $\mathrm{int}(G)$ is an
independent set.

We indicated earlier that there are some exceptional values of $\delta$ that require special treatment: values
of $\delta$ around $\frac{3}{5}n$ and $\frac{4}{7}n$. Observe that in both ranges the number of triangle components of $G_p(n, \delta)$
increases (from 2 to 3 for $\frac{3}{5}n$, and from 3 to 4 for $\frac{4}{7}n$) and thus the value $\mathrm{sp}(n, \delta)$ as a function of $\delta$ jumps.
Roughly speaking, the reason that these two ranges need to be treated separately is that again $\mathrm{sp}(n, \delta)$ is
not substantially smaller than $4\delta - 2n$ here, but we also do not know now that $\mathrm{int}(G)$ is an independent set.
For dealing with these values of $\delta$ we will use a somewhat technical case analysis which we provide at the
end of this section.
As explained above we will apply the following simple observation about matchings in graphs of given 
minimum degree. 

Proposition 11. Each graph G = (X, E) with minimum degree δ has a matching covering 2min(δ, ⌊|X|/2⌋) 
vertices. 

Proof. Let M be a maximum matching in G and assume that M contains fewer than min(δ, ⌊|X|/2⌋) edges 
and that there are two vertices x, y ∈ X not covered by M. Clearly, all neighbours of x and y are covered 
by M and thus there is an edge uv in M with xu, yv ∈ E. But then x, u, v, y is an M-augmenting path, a 
contradiction. □ 
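The augmenting step in this proof can be turned into a small procedure. The following Python sketch is only an illustration under stated assumptions: the function name `min_degree_matching` and the adjacency-dictionary input format are hypothetical and not part of the paper. It starts from a greedy maximal matching and then repeatedly applies the length-three augmentation x, u, v, y until min(δ, ⌊|X|/2⌋) edges are reached.

```python
def min_degree_matching(adj):
    """Return a matching with at least min(delta, n // 2) edges.

    adj maps each vertex to the set of its neighbours (simple undirected
    graph); delta is the minimum degree and n the number of vertices.
    """
    n = len(adj)
    delta = min(len(nbrs) for nbrs in adj.values())
    target = min(delta, n // 2)

    mate = {}
    # Start from a greedy maximal matching, so uncovered vertices are
    # pairwise non-adjacent and all their neighbours are covered.
    for u in adj:
        if u not in mate:
            for v in adj[u]:
                if v not in mate:
                    mate[u], mate[v] = v, u
                    break

    while len(mate) // 2 < target:
        # Fewer than n // 2 edges, so at least two vertices are uncovered.
        x, y = [w for w in adj if w not in mate][:2]
        for u in list(mate):
            v = mate[u]
            if u in adj[x] and v in adj[y]:
                # x, u, v, y is an augmenting path: swap uv for xu and vy.
                mate[x], mate[u] = u, x
                mate[v], mate[y] = y, v
                break
        else:
            raise AssertionError("no augmenting path of length three found")
    return {frozenset((u, v)) for u, v in mate.items()}
```

The counting argument of the proof is what guarantees that the inner loop always finds a suitable matching edge, so the `else` branch should never be reached on a valid input.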

Before turning to the proof of Lemma 8 let us quickly collect some analytical data about sp(n, δ) and 
r_p(n, δ) =: r. It is not difficult to check that 

\[ \frac{(r+1)n - r}{2(r+1) - 1} \le \delta < \frac{rn - r + 1}{2r - 1} \qquad\text{and}\qquad \frac{n - \delta}{2\delta - n + 1} \le r < \frac{\delta + 1}{2\delta - n + 1}. \tag{2} \]
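The inequalities in (2) can be sanity-checked numerically. The sketch below assumes the definition r_p(n, δ) = max{r : ⌊δ/r⌋ > 2δ − n}; this definition is not restated in this section, so it is an assumption of the example, and the helper names `r_p` and `satisfies_2` are hypothetical.

```python
from fractions import Fraction


def r_p(n, delta):
    # Assumed definition: the largest r with floor(delta / r) > 2*delta - n.
    return max(r for r in range(1, delta + 1) if delta // r > 2 * delta - n)


def satisfies_2(n, delta):
    """Check both chains of inequalities in (2) with exact arithmetic."""
    r = r_p(n, delta)
    lower = Fraction((r + 1) * n - r, 2 * (r + 1) - 1)
    upper = Fraction(r * n - r + 1, 2 * r - 1)
    gap = 2 * delta - n + 1
    first = lower <= delta < upper
    second = Fraction(n - delta, gap) <= r < Fraction(delta + 1, gap)
    return first and second
```

Under this assumption, `satisfies_2(n, delta)` can be evaluated for every n/2 < δ < n over a range of small n as a quick consistency test.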

For the proof of Lemma 8 it will be useful to note in addition that for all δ, δ' ≥ n/2 + μn with μ > 0 fixed and 
δ' such that we have r_p(n, δ') ≥ 3 and either r_p(n, δ') ≥ 5 or r_p(n, δ') = r_p(n, δ' + ηn), for η ≤ η_0 = η_0(μ) 
sufficiently small and for n ≥ n_0 = n_0(η) sufficiently large we have, if sp(n, δ + ηn) < 11n/20 then 



sp(n, 8 + nn) < § min ( ^-7^— — t _ v ZT7Zr~x~i — 2 



8 + Zr]n 

j p (n,8 + r)n)' r p (n,S + nn) "J* (3) 
sp(n, δ + ηn) ≤ 6δ − 3n, and sp(n, δ' + ηn) ≤ 4δ' − 2n, 



which follows immediately from the definition of sp(n, δ) in (1) (see also Figure 1). 

Proof of Lemma 8. Given μ choose η_0 small enough such that (3) holds for all δ ≥ n/2 + μn. For η ≤ η_0 
let n_0 be large enough for (3) and such that ηn_0 ≥ 2. Define r := r_p(n, δ) and r' := r_p(n, δ + ηn). 

If G has only one triangle component then Theorem 3 guarantees that CTF(G) ≥ 6δ − 3n and so we 
are in Case (S1). Thus we may assume in the following that G has at least two triangle components. Then 
Lemma 10(a) implies that int(C) ≠ ∅ for any triangle component C. 

Suppose that C is a triangle-component of G which does not contain a copy of K_4. Let u ∈ int(C), and 
U := {v : uv ∈ C}. By Lemma 10 the vertex u does exist, and δ(G[U]) ≥ 2δ − n. Because C contains no copy 
of K_4, U contains no triangle. It follows that |U| ≥ 2(2δ − n), and so by Proposition 11 the set U contains a 
matching M with 2δ − n edges. Finally we choose greedily for each e ∈ M a vertex v ∈ V(G) such that ev is 
a triangle. Since U is triangle free all these vertices must lie outside U, and since |Γ(e)| ≥ 2δ − n we cannot 
fail to find distinct vertices for each edge. This yields a set T of 2δ − n vertex-disjoint triangles which are all 
in C. So CTF(G) ≥ 6δ − 3n and we are in Case (S1). Henceforth we assume that every triangle-component 
contains a copy of K_4. 

We continue by considering the case (3n − 2)/5 ≤ δ < (2n − 1)/3. The following observation readily implies the 
lemma in this range as we will see in Fact 2. 

Fact 1. If G has exactly 2 triangle components, (3/5 − 2η)n ≤ δ(G), int(G) is independent, and either 
|int(G)| < n − δ − 11ηn or the exterior X of the larger triangle component satisfies |X| > (19/10)(2δ − n), then 
CTF(G) ≥ min(sp(n, δ + ηn), 11n/20). 

To see this, note that by Lemma 10(b) a vertex x ∈ X cannot have neighbours in the exterior of the 
other component and so Γ(x) ⊆ X ∪ int(G) which implies δ(G[X]) ≥ δ − |int(G)|. By Proposition 11 there 
is a matching M in G[X] covering 2min(δ − |int(G)|, ⌊|X|/2⌋) vertices. Further, every edge of M has at 
least a := 2(δ − |X|) − |int(G)| common neighbours in int(G) and every vertex u ∈ int(G) sends at least 
t := δ − (n − |X| − |int(G)|) edges into X. Therefore it is a common neighbour of at least b := t − |M| ≥ 
t − ⌊|X|/2⌋ + 5ηn edges of M. We have a + b ≥ 3δ − n − 2|X| + ⌊|X|/2⌋ + 5ηn ≥ ⌊|X|/2⌋ − 5ηn ≥ |M| 
because (3/5 − 2η)n ≤ δ and |X| ≤ n − δ by Lemma 10(a). Using Hall's theorem it is easy to verify that any 
bipartite graph H with partition classes A and B and such that vertices in A and B have degree at least a 
and b, respectively, satisfies the following. If a + b ≥ min{|A|, |B|} then H has a matching of size at least 
min{|A|, |B|}. Applying this observation to the bipartite graph with A := M and B := int(G) and edges ev 
for all common neighbours v of e we conclude that there is a set of vertex disjoint triangles in int(G) ∪ X 
which either covers int(G) or includes all edges of M. This gives CTF(G) ≥ 3min(δ − |int(G)|, ⌊|X|/2⌋ − 
5ηn, |int(G)|) ≥ 3min(2δ − n, ⌊|X|/2⌋ − 5ηn) since |int(G)| ≥ 2δ − n by Lemma 10 and |int(G)| ≤ n − δ. 
Hence (3) implies CTF(G) ≥ min(sp(n, δ + ηn), 11n/20) unless 

\[ 3\left( \left\lfloor \frac{|X|}{2} \right\rfloor - 5\eta n \right) < \min\left( sp(n, \delta + \eta n), \tfrac{11}{20} n \right). \tag{4} \]

If sp(n, 5 + rpi) < |jjn then ([J) implied that \X\ — 10nn was less than the largest exterior set of G p (n, 8 + nn) 
which is of size at most |<5 + 1. Consequently \X\ < ^8+1 + 10nn < jfi(2S — n) because 8 > (| — 277)71, and 
|int(G)| > | int(G p (n, S+i]n))\ — lOrjn = n — 5 — llr/n, a contradiction. On the other hand, sp(ti, <5+777i) > i^n 
can only be the case if 5 > (| — 277)71. But then |X| < n— 8 < y§(2<5 — n), and | int(G)| > n — 2\X\ together 
with |D) is not consistent with | int(G)| < n — 5 — linn, a contradiction. 

Fact 2. Lemma 8 is true for (3n − 2)/5 ≤ δ < (2n − 1)/3. 

Observe that in this range r = 2. Assume G has an edge uv in int(G), let x be a common neighbour 
of u and v and C be the triangle component containing ux and vx. Then there are edges uy and vz of 
G outside C. The sets Γ(u, y), Γ(v, z) and {u, v, x, y, z} are pairwise disjoint, and x is not adjacent to 
Γ(u, y) ∪ Γ(v, z) ∪ {y, z}. So δ ≤ d(x) ≤ (n − 1) − 2(2δ − n) − 2 which is only possible when δ ≤ (3n − 3)/5, a 
contradiction. Thus int(G) is an independent set, which implies |int(G)| ≤ n − δ. Hence G cannot have three 
triangle components by Lemma 10(a). In particular, all vertices in int(G) lie in both triangle components 
of G. So if |int(G)| ≥ n − δ − 11ηn then int(G) is the desired large independent set. If moreover all triangle 
component exteriors are of size (19/10)(2δ − n) at most we are in Case (S3). Otherwise (if int(G) is small or an 
exterior is large) Fact 1 gives CTF(G) ≥ min(sp(n, δ + ηn), 11n/20) which is Case (S2). 

Now suppose δ < (3n − 2)/5 and accordingly r ≥ 3 and r' ≥ 2. For dealing with this case we first establish 
three auxiliary facts. The first one captures the greedy technique for finding a large connected triangle factor 
that we sketched in the beginning of this section. We will use this technique throughout the rest of the proof. 

Fact 3. If there are two sets U_1, U_2 ⊆ V(G) such that no vertex in U_1 has a neighbour in U_2, all edges in 
G[U_1] are triangle connected and δ(G[U_1]) ≥ δ_1 then CTF(G) ≥ min(3⌊|U_1|/2⌋, 3δ_1, 2δ − n + |U_2|). 

By Proposition 11 we can find a matching M in U_1 covering min(2⌊|U_1|/2⌋, 2δ_1) vertices. Now for each 
edge e ∈ M in turn we pick greedily a common neighbour of e outside both M and the previously chosen 
common neighbours to obtain a set T of disjoint triangles. For any x, y ∈ U_1 we have |Γ(x, y)| ≥ 2δ − (n − |U_2|). 
Hence T covers at least min(3⌊|U_1|/2⌋, 3δ_1, 2δ − n + |U_2|) vertices. Note further that T is a connected triangle 
factor because all edges in G[U_1] are triangle connected. 
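The greedy extension step in the proof of Fact 3 can be sketched as follows. This is a minimal illustration, not the authors' procedure verbatim: the function name `extend_matching_to_triangles` and the adjacency-dictionary representation are assumptions of the example, and ties are broken by taking the smallest available common neighbour (so vertices are assumed to be comparable).

```python
def extend_matching_to_triangles(adj, matching):
    """Greedily extend matching edges to vertex-disjoint triangles.

    adj maps each vertex to its set of neighbours; matching is a list of
    pairs (u, v) of adjacent vertices, pairwise disjoint. For each edge in
    turn we pick a common neighbour outside the matching and outside the
    previously chosen vertices; edges with no such neighbour are skipped.
    """
    used = {v for e in matching for v in e}
    triangles = []
    for u, v in matching:
        candidates = (adj[u] & adj[v]) - used
        if candidates:
            w = min(candidates)
            used.add(w)
            triangles.append((u, v, w))
    return triangles
```

In the setting of Fact 3 the bound |Γ(x, y)| ≥ 2δ − (n − |U_2|) is what limits how long the greedy choice can continue before candidates may run out.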

Fact 4. Let uv be an edge in int(G). Unless r' = 2 at least one vertex, u or v, is contained in at most r' − 1 
triangle components. 

Indeed, let C_1 be the triangle component containing uv ∈ int(G) along with the (non-empty) common 
neighbourhood Γ(u, v) (and perhaps some other neighbours of u or v separately). Assume that both u and 
v live in r' − 1 other triangle components. Together with Lemma 10(c) this implies that (Γ(u) ∪ Γ(v)) \ C_1 
contains in total at least 2(r' − 1) mutually disjoint sets (since common neighbours of u and v are in C_1) of at 
least 2δ − n + 1 vertices each. Let their union be U. Given any x ∈ Γ(u, v), since ux and vx are both in C_1, 
x cannot be adjacent to any vertex of U. But then δ ≤ d(x) < n − (2r' − 2)(2δ − n + 1) which is equivalent 
to 2r' − 2 < (n − δ)/(2δ − n + 1). By (2) the right-hand side is at most r and thus we get 2r' − 2 < r. Since 
r ≤ r' + 1 however this is a contradiction unless r' ≤ 2. 



We assume from now on that CTF(G) < sp(n, δ + ηn), that is, we are not in Cases (S1) or (S2). Our 
aim is to conclude that then (*) int(G) is an independent set and that its vertices are contained in at least 
r' triangle components. It turns out, however, that we need to consider the cases r = r' + 1 = 3 and 
r = r' + 1 = 4 (i.e., the cases when the minimum degree δ is just a little bit below 3n/5 and 4n/7, respectively) 
separately. Unfortunately these two cases, which are treated by Fact 5, require a somewhat technical case 
analysis which we prefer to defer to the end of the section. 

Fact 5. If r = r' + 1 = 3 or r = r' + 1 = 4 then int(G) is an independent set all of whose vertices are 
contained in at least r' triangle components. 
Assuming this fact is true we can deduce (*) for all values r ≥ 3 as follows. 

Fact 6. The set int(G) is an independent set (and hence of size at most n − δ) all of whose vertices are 
contained in at least r' triangle components. 

The cases r = r' + 1 = 3 and r = r' + 1 = 4 are handled by Fact 5. So we assume we are not in these 
cases. We will show that then each vertex of int(G) is contained in at least r' triangle components. Once 
we have established this, Fact 4 implies that there are no edges in int(G) and so int(G) is an independent set as 
desired. 

To prove that each vertex of int(G) is contained in at least r' triangle components we assume the contrary 
and show that then CTF(G) ≥ sp(n, δ + ηn), a contradiction. Indeed, let w ∈ int(G) and U_1, ..., U_k be the 
neighbours of w partitioned by the triangle-component of the edge to w, where there are k > 1 components 
containing w. By Lemma 10(c) we have δ(G[U_i]) ≥ 2δ − n and |U_i| ≥ 2δ − n + 1. Suppose that U_1 is the largest 
of the U_i's. No vertex in U_1 has a neighbour in U_2 (since the components are distinct) and all edges in G[U_1] 
are triangle connected (because U_1 ⊆ Γ(w)). Therefore Fact 3 implies that there is a connected triangle factor 
T in G covering min(3⌊|U_1|/2⌋, 3(2δ − n), 2δ − n + |U_2|) ≥ min(3⌊|U_1|/2⌋, 4δ − 2n) vertices. If w lies only in 
r' − 1 triangle components then |U_1| ≥ δ/(r' − 1) and therefore T covers at least min(3δ/(2r' − 2) − 1, 4δ − 2n) 
vertices. This connected triangle factor is not smaller than that found in G_p(n, δ + ηn), since (3) and the 
choice of η_0 and n_0 imply (3/2)δ/(r' − 1) − 1 ≥ sp(n, δ + ηn) and 4δ − 2n ≥ sp(n, δ + ηn) as r ≥ 3 provided 
that either r ≥ 5 or r = r'. This assumption is fulfilled as the cases r = r' + 1 = 3 and r = r' + 1 = 4 are 
excluded. 

Fact 7. We are in Case (S3). 

To conclude this we will show that |int(G)| =: a ≥ n − δ − 11ηn and |X_1| ≤ (19/10)(2δ − n) for the 
biggest exterior X_1 in G. Suppose for a contradiction that this is not the case. An easy calculation shows 
that this forces G to have exactly r' triangle components. Indeed, assume G has at least r' + 1 triangle 
components. If a < n − δ − 11ηn then each of these components C has vertices in the exterior ∂(C) and 
so by Lemma 10(b) the minimum degree of G implies |∂(C)| ≥ δ − a. Accordingly (r' + 1)(δ − a) + a ≤ n 
which implies (r' + 1)δ ≤ (n − a) + (r' + 1)a ≤ n + r'(n − δ − 11ηn). Straightforward manipulation gives 
δ + ηn ≤ ((r' + 1)n − ηn(9r' − 1))/(2(r' + 1) − 1). Since ηn(9r' − 1) > 9r' − 1 > r' this contradicts (2) applied 
to r' = r_p(n, δ + ηn). If |X_1| > (19/10)(2δ − n) on the other hand we use Lemma 10(a) and Lemma 10(b) to get 
δ + (19/10)(2δ − n) + (r' − 1)(2δ − n + 1) ≤ n. By (2) we have r' ≥ (n − δ − ηn)/(2δ + 2ηn − n + 1). Combined 
with the last inequality this gives 

\[ \tfrac{9}{10}(2\delta - n) - 1 + (n - \delta - \eta n)\,\frac{2\delta - n + 1}{2\delta - n + 1 + 2\eta n} \le n - \delta \]

which is a contradiction for δ ≥ (1/2 + μ)n (and η small enough). Hence G has exactly r' triangle components. 

Now, if r' = 2, and accordingly δ ≥ (3/5 − 2η)n, then Fact 1 implies |int(G)| ≥ n − δ − 11ηn and 
|X_1| ≤ (19/10)(2δ − n) because CTF(G) < sp(n, δ + ηn) and so in the remainder we assume r' > 2. There are at 
most δ + 11ηn vertices outside int(G). Since X_1 is the largest exterior it follows that |X_1| ≥ (δ + 11ηn)/r'. 
To bound δ(G[X_1]) from below notice that any vertex x ∈ X_1 either has δ neighbours in X_1 or a neighbour 
w in int(G). In view of the second case the fact that int(G) is independent and Lemma 10(c) imply that 
δ(G[X_1]) ≥ min{2δ − n, δ} = 2δ − n. Moreover, there are at least r' − 1 component exteriors other than X_1 
in G, each of size at least 2δ − n + 1 by Lemma 10(c) and so there is a set X_2 with |X_2| ≥ (r' − 1)(2δ − n + 1) 
such that no vertex in X_1 has a neighbour in X_2. By Fact 3 we thus get CTF(G) ≥ min(3⌊|X_1|/2⌋, 3(2δ − 
n), 2δ − n + |X_2|). Note that 2δ − n + |X_2| ≥ 2δ − (n − (r' − 1)(2δ − n + 1)) = r'(2δ − n + 1) − 1 ≥ 6δ − 3n because 
r' > 2. Further, by (3) and the choice of η_0 and n_0 we have 3⌊|X_1|/2⌋ ≥ (3/(2r'))(δ + 11ηn) − 2 ≥ sp(n, δ + ηn) 
and 6δ − 3n ≥ sp(n, δ + ηn) and so CTF(G) ≥ sp(n, δ + ηn) which contradicts our assumption. □ 

To complete the proof above it remains to show Fact 5. Note that we can use all facts from the proof of 
Lemma 8 that precede Fact 5. We will further assume that all constants and variables are set up as in this 
proof. 

The proof of Fact 5. Recall that we assumed that CTF(G) < sp(n, δ + ηn) in this part of the proof of 
Lemma 8. 
We first concentrate on the case r = 3 and r' = 2. In this case δ(G) ∈ [(3/5 − 2η)n, (3/5 + η)n]. Trivially 
each vertex of int(G) is contained in at least r' = 2 triangle components. Assume there is an edge uv in 
int(G), let x be a common neighbour of u and v, and C be the triangle component containing the triangle 
uvx. Let U_1 = {y : uy ∈ C} and V_1 = {y : vy ∈ C}, set U_2 = Γ(u) \ U_1 and V_2 = Γ(v) \ V_1, and define 
W_u = V(G) \ (U_1 ∪ U_2 ∪ V_2) and W_v = V(G) \ (V_1 ∪ V_2 ∪ U_2). 

By definition x has no neighbour in U_2 ∪ V_2. It follows that |U_2 ∪ V_2| ≤ n − δ. On the other hand, by 
Lemma 10(c) we have |U_2|, |V_2| ≥ 2δ − n ≥ (1/5)n − 4ηn. This implies |U_2| ≤ n − δ − |V_2| ≤ 2n − 3δ and 
|V_2| ≤ 2n − 3δ. Since neither u nor x have neighbours in V_2 we have 

\[ |\Gamma(x, u)| \ge 2\delta - (n - |V_2|) \ge \tfrac{2}{5} n - 8\eta n \ge n - \delta - 10\eta n. \]

No vertex y ∈ U_2 is adjacent to any vertex in Γ(x, u). We conclude that |Γ(x, u)| ≤ n − δ and hence 
|W_u| ≥ n − |U_2| − |V_2| − |Γ(x, u)| ≥ (1/5)n − 14ηn. In addition each vertex y ∈ U_2 has at most 10ηn non- 
neighbours in U_2 ∪ V_2 ∪ W_u. By symmetry also every vertex y ∈ V_2 has at most 10ηn non-neighbours in 
U_2 ∪ V_2 ∪ W_v. But then it is easy to cover all but at most 1 vertex of U_2 with a matching (cf. Proposition 11) 
and all but at most 1 vertex of V_2 with another matching. This gives a matching M_u covering at least 
(1/5)n − 5ηn vertices of U_2. Because each vertex of U_2 has at most 10ηn non-neighbours in W_u we can then 
extend the matching edges in M_u with vertices from W_u to obtain a set T_u of (triangle connected) vertex 
disjoint triangles covering at least — lOOr/n) vertices. Repeating the same with a matching M v in V 2 

and vertices from W v not used in T u gives a connected triangle factor T v covering \(\n — IOO7771) vertices. 
Since, moreover, vertices in U 2 have at most lOryn non-neighbours in V 2 any two vertices of U 2 clearly have 
an edge of V 2 in their common neighbourhood. Thus T u and T v together form a connected triangle factor 
covering at least 3(^n — lOOr^n) which is larger than sp(n, 5 + rjn) < ^n, a contradiction. 

Now assume that r = 4 and r 1 = 3. Then 5(G) G [(| — 2rj)n, (| + rf)n], and consequently sp(n, 5 + 7]n) < 
(|+277)n. Assume first there is some vertex u G int(G) such that u is in exactly r' — l — 2 triangle components 
C and C and let U and U' be the set of neighbours of u on edges in C and C, respectively, with \U\ > \U'\. 
Applying exactly the same strategy as in the proof of Fact [6] in Lemma [8] wc obtain a triangle factor covering 
at least min(3[J{7|/2j , 3(25 — n),25 — n + |{7'|) vertices. Because \U\ > \5 we conclude from © that this 
triangle factor covers at least (I + 2rj)n vertices if \U'\ > (7 + 6rj)n. Similarly, if a vertex u has three sets of 
neighbours Ui, U 2 , U3 on edges in three different triangle components of G then we obtain a triangle factor 
covering at least min(3|J{7i |/2J , 3(2^ — n),25 — n + \U 2 \ + \U^\) vertices. This is larger than (| + 2rj)n if 
\Ui\ > (tjj + 2r\)n. Hence we can assume from now on that the following holds. 

(†) If u has sets of neighbours U, U' on edges in exactly two different triangle components with |U| ≥ |U'| 
then (1/7 − 4η)n ≤ |U'| < (1/7 + 6η)n and (3/7 − 8η)n ≤ |U| ≤ (3/7 + 2η)n. 

(‡) If u has sets of neighbours U_1, U_2, U_3 on edges in exactly three different triangle components then 
(4/21 + 2η)n > |U_i| ≥ (4/21 − 6η)n for i ∈ [3]. 

Next we show that int(G) is an independent set. Assume for a contradiction that there is an edge 
uv in int(G). By Fact 4 one of the vertices of this edge, say u, is only in 2 triangle components; let its 
neighbours be U_1 and U_2 in these two triangle components, and let the neighbours of v be partitioned 
into sets V_1, ..., V_k according to the triangle component containing the edge to v. Assume further that 
Γ(u, v) ⊆ U_1 ∩ V_1. Let x ∈ Γ(u, v). If we had |U_2| ≥ (3/7 − 8η)n then, since x has neighbours in neither U_2 
nor V_2, and |V_2| ≥ (1/7 − 4η)n, we would have d(x) ≤ (3/7 + 12η)n which is a contradiction. So by (†) we have 
(1/7 − 4η)n ≤ |U_2| ≤ (1/7 + 6η)n. Similarly, if k ≥ 3 then by (‡) we have |U_2 ∪ V_2 ∪ ··· ∪ V_k| ≥ (11/21 − 8η)n and 
again d(x) ≤ (10/21 + 8η)n is a contradiction. It follows that k = 2 and (1/7 − 4η)n ≤ |V_2| ≤ (1/7 + 6η)n. 

All vertices in U_2 (respectively, V_2) have at most 10ηn non-neighbours outside U_1 (respectively, V_1). 
Accordingly we obtain a situation similar to the one we established in the case r = 3 and r' = 2 above. 
We proceed similarly as there and just sketch the argument here: U_2 and V_2 induce almost complete graphs 
in G (i.e., for each vertex at most 10ηn edges are missing). Hence we can find matchings M_u and M_v 
in U_2 and V_2, respectively, that almost cover these sets. Moreover almost all edges between U_2 and W_u = 
V(G) \ (U_1 ∪ U_2 ∪ V_2) are present and |W_u| ≥ (2/7)n − 14ηn is almost twice as big as U_2. Accordingly we can use W_u 
to extend M_u to a collection of vertex disjoint triangles T_u. Similarly we can use W_v = V(G) \ (U_2 ∪ V_1 ∪ V_2) to 
extend M_v to a collection of vertex disjoint triangles T_v, avoiding T_u. Since also almost all edges between U_2 
and V_2 are present T_u ∪ T_v forms a connected triangle factor covering at least (3/2)|U_2| + (3/2)|V_2| − 80ηn ≥ (3/7 − 100η)n 
vertices. With this contradiction to the size of CTF(G) we have established that int(G) is an independent 
set. 

It remains to show that each vertex u ∈ int(G) is contained in at least 3 triangle components. Assume 
for a contradiction that this is not the case and u is only contained in 2 triangle components C and C' 
and let U and U', respectively, be the neighbours of u on edges in C and C'. Without loss of generality 
|U| ≥ |U'|. Because int(G) is an independent set U and U' are contained in the exteriors of C and C'. 
It follows that there are no edges between U and ∂(C'), and exactly the same argument that we used to 
show (†) above now implies that (1/7 − 4η)n ≤ |∂(C')| ≤ (1/7 + 6η)n. Since vertices in ∂(C') meet only edges 
in C' we conclude that |int(G)| ≥ (3/7 − 8η)n. It follows that all but at most 8ηn vertices of G are contained 
in U ∪ U' ∪ int(G). Moreover, as int(G) is independent, |int(G)| ≤ (3/7 + 2η)n and each vertex in int(G) has at 
most 10ηn non-neighbours in U. In addition, because vertices in U have no neighbours in U' they have at 
most |int(G)| + 8ηn ≤ (3/7 + 10η)n neighbours outside U. It follows that δ(G[U]) ≥ (1/7 − 12η)n. Thus we can 
proceed as before and do the following: We use Proposition 11 to find a matching M in U covering at least 
(2/7 − 24η)n vertices of U. Then we extend this matching with vertices from int(G) and obtain a connected 
triangle factor covering at least (3/7 − 100η)n vertices, a contradiction. □ 



4 Near-extremal graphs 

In this section we provide the proof of Lemma 9. To prepare this proof we start with two useful lemmas. 
The first will be used to construct our squared paths and squared cycles from simple paths and cycles. 

Lemma 12. Given a graph G, let T = (t_1, t_2, ..., t_{2ℓ}) be a path in G and W a set of vertices disjoint from 
T. Let Q_1 = (t_1, t_2), Q_i = (t_{2i−3}, t_{2i−2}, t_{2i−1}, t_{2i}) for all 1 < i ≤ ℓ, and Q_{ℓ+1} = (t_{2ℓ−1}, t_{2ℓ}). If there exists 
an ordering σ of [ℓ + 1] such that for each i, Q_{σ(i)} has at least i common neighbours in W, then there is a 
squared path (q_1, t_1, t_2, q_2, t_3, t_4, q_3, ...) in G, with q_i ∈ W for each i, using every vertex of T. 

If T is a cycle on 2ℓ vertices we let instead Q_1 = (t_{2ℓ−1}, t_{2ℓ}, t_1, t_2), Q_i = (t_{2i−3}, t_{2i−2}, t_{2i−1}, t_{2i}) for all 
1 < i ≤ ℓ, and σ be an ordering on [ℓ]. Then, under the same conditions, we obtain a squared cycle C^2_{3ℓ}. 

Proof. We need only ensure that for each i, q_i is a common neighbour of Q_i and the q_i are distinct. This is 
possible by choosing for each i in succession q_{σ(i)} to be any so far unused common neighbour of Q_{σ(i)}. □ 
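The greedy choice in this proof is easy to express in code. The sketch below is an illustration under assumptions: the input is taken to be a list whose i-th entry is the set of common neighbours in W of the i-th tuple Q, the function names are hypothetical, and the ordering σ is obtained by sorting the tuples by the number of common neighbours (if any valid ordering exists, this sorted one is valid as well).

```python
def choose_squaring_vertices(common_nbrs):
    """Pick distinct q[i] in common_nbrs[i] greedily, or return None.

    common_nbrs[i] is the set of common neighbours in W of the i-th tuple
    Q. Tuples with fewer common neighbours are served first, which plays
    the role of the ordering sigma in Lemma 12.
    """
    order = sorted(range(len(common_nbrs)), key=lambda i: len(common_nbrs[i]))
    used = set()
    q = [None] * len(common_nbrs)
    for i in order:
        free = common_nbrs[i] - used
        if not free:
            return None
        q[i] = min(free)
        used.add(q[i])
    return q


def interleave(path, q):
    """Return (q_1, t_1, t_2, q_2, t_3, t_4, ..., q_{l+1}) for a path on 2l vertices."""
    out = [q[0]]
    for i in range(len(path) // 2):
        out += [path[2 * i], path[2 * i + 1], q[i + 1]]
    return out
```

When the hypothesis of the lemma holds, the k-th tuple served has at least k common neighbours while only k − 1 vertices have been used, so `choose_squaring_vertices` does not return `None`.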

The second, a variant on Dirac's theorem, permits us to construct paths and cycles of desired lengths 
which keep some 'bad' vertices far apart. 

Lemma 13. Let H be a graph on h vertices and B ⊆ V(H) be of size at most h/100. Suppose that every 
vertex in B has at least 9|B| neighbours in H, and every vertex outside B has at least h/2 + 9|B| + 10 
neighbours in H. Then for any given 3 ≤ ℓ ≤ h we can find a cycle T_ℓ of length ℓ in H on which no four 
consecutive vertices contain more than one vertex of B. Furthermore, if x and y are any two vertices not in 
B and 5 ≤ ℓ ≤ h, we can find an ℓ-vertex path T_ℓ whose endvertices are x and y on which no four consecutive 
vertices contain more than one vertex of B ∪ {x, y}. 

Proof. If we seek a path in H from x to y then we create a 'dummy edge' between x and y. If we seek a 
cycle, let xy be any edge of H − B. 

First we construct a path P in H covering B with the desired property. Let B = {b_1, b_2, ..., b_{|B|}}. For 
each 1 ≤ i ≤ |B| − 1, choose a vertex u_i ∈ H − B adjacent to b_i and a vertex v_i ∈ H − B adjacent to 
b_{i+1}. Because both u_i and v_i have h/2 + 9|B| + 10 neighbours in H, they have at least 18|B| + 20 common 
neighbours. At most 3|B| of these are either in B or amongst the chosen u_j, v_j, and so we can find a so far 
unused vertex w_i adjacent to u_i and v_i; since we require only |B| − 1 vertices w_1, ..., w_{|B|−1} we can pick the 
vertices greedily. 

We let v_0 be yet another vertex adjacent to b_1, and u_{|B|} adjacent to b_{|B|}, and choose any further vertices 
u_0, w_0, w_{|B|}, v_{|B|} such that 

\[ P = (x, y, u_0, w_0, v_0, b_1, u_1, w_1, v_1, b_2, \dots, v_{|B|-1}, b_{|B|}, u_{|B|}, w_{|B|}, v_{|B|}) \]

is a path on 4|B| + 5 vertices. 
Now we let P' be a path extending P in H of maximum length. We claim that P' is in fact spanning. 
Suppose not: let u be an end-vertex of P' and v a vertex not on P'. Since P' is maximal every neighbour of 
u is on P', so v(P') > h/2 + 9|B| + 10. If there existed an edge u'v' of P' − P with u'u and v'v edges of H, 
with v' closer to u on P' than u', then we would have a longer path extending P in H. Counting the edges 
leaving u and v yields a contradiction. 

Finally we let u and v be the end-vertices of the spanning path P'. If uv is an edge of H, or if u'v' is an 
edge of P' − P (with u' nearer to u on P' than v') such that uv' and u'v are edges of H, then we obtain a 
cycle T spanning H and containing P as a subpath. Again edge counting reveals that such an edge must 
exist. 

To obtain a cycle T_ℓ with h − |B| − 2 < ℓ < h we take u to be an end-vertex of the path T − P and v 
its successor on T − P. If we can find two further vertices u' and v' on T − P (in that order from u along 
T − P) with h − ℓ vertices between them and with uu' and vv' edges of H then we would obtain a cycle T_ℓ 
of length ℓ. Again simple edge counting reveals that such a pair of vertices exists. To obtain a cycle T_ℓ with 
3 ≤ ℓ ≤ h − |B| − 2 we note that H − B has minimum degree h/2 + 8|B| + 10 > (h − |B|)/2 + 1 and thus 
contains a cycle of every possible length using the edge xy. 

The cycle T_ℓ satisfies the condition that no four consecutive vertices contain more than one vertex of B, 
since either it preserves P as a subpath or it contains no vertices of B at all; similarly the path from x to y 
within T_ℓ satisfies the required conditions. □ 
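The spacing condition in Lemma 13 (no four consecutive vertices of the cycle contain more than one vertex of B) is simple to verify for a concrete cycle. The helper below is a hypothetical checker written for illustration; its name and its representation of a cycle as a list of vertices in cyclic order are assumptions of the example.

```python
def well_spread(cycle, bad, window=4):
    """Check that every `window` cyclically consecutive vertices of `cycle`
    contain at most one vertex of `bad`.

    cycle is a list of distinct vertices in cyclic order; bad is a set.
    For cycles shorter than the window, each vertex is counted only once.
    """
    n = len(cycle)
    span = min(window, n)
    return all(
        sum(cycle[(i + k) % n] in bad for k in range(span)) <= 1
        for i in range(n)
    )
```

For a path rather than a cycle one would drop the wrap-around and only slide the window over positions 0 to n − window.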

Before embarking upon the proof of Lemma 9 we give an outline of the method. We recall that the 
Szemerédi partition supplied to the Lemma is essentially the extremal structure: our task is to show that 
the underlying graph either has the same structure or possesses features which lead to longer squared 
paths and cycles than required for the conclusion of the Lemma. This is complicated by the fact that the 
Szemerédi partition is insensitive both to mis-assignment of a sublinear number of vertices and to editing of 
a subquadratic number of edges: we must assume, for example, that although the vertex set I in the reduced 
graph R is independent, the vertex set ⋃I in G may contain some vertices with very high degree into ⋃I, 
may fail to contain some vertices of G with no neighbours in ⋃I, and may contain a sublinear number of 
edges meeting every vertex. Fortunately, it is possible to reassign vertices in this case by separating those 
vertices with 'few' neighbours in ⋃I, which we shall collect in a set W, and those with 'many'. We are 
then able to show (as Fact 8 below) that, if there are two vertex disjoint edges in W, then the sets ⋃B_1 
and ⋃B_2 are in the same triangle component of G ('unexpectedly', since B_1 and B_2 are in different triangle 
components in R). We shall show that in this case it is possible to construct very long squared paths and 
cycles by making use of ⋃B_1 and ⋃B_2. 

Hence we can assume that there are not two disjoint edges in W, which in turn implies that W is almost 
independent and will give us rather precise control over the size of W. In addition, the minimum degree 
condition will guarantee that almost every edge from W to the remainder of G is present. We would like to 
then say that in V(G) \ W we can find a long path, which together with vertices from W forms a squared 
path (and similarly for squared cycles). Unfortunately since G[W, V(G) \ W] is not necessarily a complete 
bipartite graph, this statement is not obviously true: although by definition no vertex outside W has very few 
neighbours in W, it is certainly possible that two vertices outside W could fail to have a common neighbour 
in W. But the statement is true for a path possessing sufficiently nice properties (specifically, satisfying 
the conditions of Lemma 12) and the purpose of Lemma 13 is to provide paths and cycles with those nice 
properties. The remainder of our proof, then, consists of setting up conditions for application of Lemma 13. 

Proof of Lemma 9. If δ ≥ (2n − 1)/3 then we appeal to Theorem 1 to find a spanning squared path in G. If 
δ ≥ 2n/3 we appeal to Theorem 2 to find C^2_ℓ for each ℓ ∈ [3, n] \ {5}. Hence we can assume in the following 
that δ < 2n/3 (which implies that there are at least two B_i) and that r_p(n, δ) ≥ 2 and r_c(n, δ) ≥ 2. 
Let δ' = δ(R) ≥ (δ/n − d − ε)m. Observe that 

\[ |I| \le m - \delta' \le (1 - \delta/n + d + \varepsilon)\, m \tag{5} \]

because clusters in I have δ' neighbours outside I in R. For i ∈ [k], fix a cluster C ∈ B_i. Since δ' ≤ deg(C) = 
deg(C, B_i ∪ I) ≤ deg(C, B_i) + |I| ≤ deg(C, B_i) + m − δ' we conclude that 

\[ |B_i| \ge 2m(2\delta - n)/(3n). \tag{6} \]
(6) 
Fact 8. If u_1v_1 and u_2v_2 are vertex disjoint edges of G such that the edge u_iv_i has at least δ − (2δ − n)/16 
common neighbours outside ⋃I for i = 1, 2, then G contains P^2_{sp(n,δ)} and C^2_ℓ for each ℓ ∈ [3, sc(n, δ)] \ {5}. 

Indeed, let D be the set of clusters C ∈ B_1 such that either u_1v_1 or u_2v_2 has at most 2dn/m common 
neighbours in C. Using (5) and the hypothesis on u_1v_1 and u_2v_2 in Fact 8 we get |D| ≤ (2δ − n)m/(7n). 
Therefore, we conclude from (6) that B_1 \ D ≠ ∅. Take X ∈ B_1 \ D arbitrarily. We have deg(X, B_1) ≥ 
deg(X) − |I| ≥ δ' − |I| > |D|, using (5). Thus there exists a cluster Y ∈ Γ(X) ∩ (B_1 \ D). Similarly, we can 
find clusters X', Y' ∈ B_2, such that X'Y' ∈ E(R), and each of the clusters X', Y' contains at least 2dn/m 
common neighbours of both u_1v_1 and u_2v_2. 

Since δ_R(B_1), δ_R(B_2) ≥ δ' − |I|, we can find greedily a matching M in R[B_1 ∪ B_2] with δ' − |I| edges. 
Since every cluster in I has at most m − |I| − δ' non-neighbours outside I, every cluster in I forms a triangle 
with at least |M| − (m − |I| − δ') edges of M. Since δ' − |I| ≤ |I|, we may choose greedily clusters in I to 
obtain a set T of at least 2δ' − m vertex-disjoint triangles formed from edges of M and clusters of I. Let T_1 
be the triangles of T contained in B_1 ∪ I, and T_2 those contained in B_2 ∪ I. Observe that all the triangles in 
T_1 are in the same triangle component as the edge XY, and all the triangles in T_2 are in the same triangle 
component as the edge X'Y'. 

We can apply Lemma [7] with X\ = X2 = X, Y\ = Y% = Y to find a squared path starting with u\V\ and 
finishing with U2V2 using the triangles T\. Similarly, using Lemma [7| with X\ = X2 = X', Y\ = Y2 = Y' 
we find a squared path (intersecting the first only at u\, v\, U2 and V2) starting with U2V2 and finishing 
with U1V1 using the triangles T2. Concatenating the two squared paths we have a squared cycle Cj in G, 
where we may choose the lengths of the squared paths such that 3(m EL + 2) 2 < £ < 3(1 — <i)(2i5' — m)n/m. 
Applying Lemma [7] to the copy of K4 in B\ directly we obtain C\ for each I e [3, 3n/m] \ {5}. It follows 
that G contains both P s 2 p( - n g \ and Cj for each £ € [3, sc(n, 8)] \ {5} as required. This concludes the proof of 
Fact El 

Let us next examine the size of W. To simplify notation, we set £ = \/e + d + fi. Let W be the vertices 
of G which do not have more than £n neighbours in (J/. We infer from the fact that / is independent and 
from the definition of the reduced graph that | — W\ < en. Recall that \I\ > (n — 8 — /.m)m/n. Every 
edge in W has at least 2(8 — £n) — (n — \ [J I\) > 8 — (28 — n)/16 common neighbours outside 1J /. If there 
are two disjoint edges in W then we are done by Fact [5] Thus assume that no such two edges exist. It 
follows that there are two vertices in W which meet every edge in W, and since neither has more than £n 
neighbours in 1J / there is a vertex in W adjacent to no vertex of W. We conclude that 

\[ n - \delta - \mu n - 2\varepsilon n \le \left|\textstyle\bigcup I\right| - \varepsilon n \le |W| \le n - \delta. \tag{7} \]

Our next goal is to extract from each set $\bigcup B_i$ a large set $A_i$ of vertices which are adjacent to almost all
vertices in $W$ and are such that $G[A_i]$ has minimum degree somewhat above $|A_i|/2$. For this purpose we
first show that most vertices in $\bigcup B_i$ have many neighbours in $\bigcup B_i$. Because at least $|W|\delta - 2|W|$ edges leave
$W$, the average number of edges from a vertex $v \in V(G) \setminus W$ going to $W$ is at least

n — I W I + urt + 2en 

In particular, at most $\xi^2 n$ vertices outside $W$ have fewer than $|W| - \xi^2 n$ neighbours in $W$. Furthermore, if
there existed $\varepsilon n$ vertices in $\bigcup B_i$ which all have at least $2dn$ neighbours in $G - W - \bigcup B_i$ then by averaging
and regularity there would exist a pair of clusters, one in $B_i$ and the other in $B_j$ for some $j \neq i$, adjacent in
$R$. It follows that all but at most $\varepsilon n$ vertices of $\bigcup B_i$ have at least

$\delta - |W| - 2dn > |\bigcup B_i|/2 + 32\xi^2 n$ (8)

neighbours in $\bigcup B_i$.

Now, for each $i \in [k]$ we let $A_i$ be the set of vertices in $\bigcup B_i$ which are adjacent to at least $|W| - \xi^2 n$
vertices of $W$ and to at least $|\bigcup B_i|/2 + 32\xi^2 n$ vertices of $\bigcup B_i$. The vertices which are neither in $W$ nor any of
the sets $A_i$ must be in the original bin set $V_0$, removed from $\bigcup I$, or removed from one of the sets $\bigcup B_i$. There
are at most $\varepsilon n + \varepsilon n + \xi^2 n + k\varepsilon n < 2\xi^2 n$ such vertices. Accordingly (8) implies that $\delta(G[A_i]) > |A_i|/2 + 30\xi^2 n$
and

$|A_i| > 2\delta - n - 3\xi^2 n$ (9)

for each $i \in [k]$ and hence $A_i$ has the desired properties.

In the remainder of the proof we utilize the sets $A_i$ in order to find the desired squared path and squared
cycles. We start by showing that we obtain squared cycles on $\ell$ vertices for each $\ell \in [3, \frac32|A_1|] \setminus \{5\}$. To
see this note first that by Lemma 13 (with $B = \emptyset$) we find in $A_1$ a copy of $C_{2\ell'}$ for each $2\ell' \in [4, |A_1|]$. By
definition of $A_1$ we can certainly apply Lemma 12 to square this cycle. This gives us squared cycles of the
desired lengths divisible by three, but not of other lengths.

If we seek a squared cycle $C^2_{3\ell'+1}$ or $C^2_{3\ell'+2}$ then we need to perform a process which we will call parity
correction and which we explain in the following two paragraphs. We shall use this parity correction process
also in all remaining steps of the proof to obtain squared cycles of lengths not divisible by 3.

For obtaining a squared cycle of length $3\ell' + 1$ we proceed as follows. We pick a triangle $abc$ in $A_1$ and
clone the vertex $b$, i.e., we insert a dummy vertex $b'$ into $G$ with the same adjacencies as $b$. Then we apply
Lemma 13 to $A_1 - \{b\}$ to find a path $P = (a, p_2, p_3, \ldots, p_{2\ell'-1}, c)$ on $2\ell'$ vertices whose end-vertices are $a$ and
$c$. Finally we apply Lemma 12 to the path $bPb'$, taking $Q_1 = (b, a)$, $Q_2 = (b, a, p_2, p_3)$ as the first quadruple
and thereafter every other set of four consecutive vertices on $P$, finishing with $(p_{2\ell'-2}, p_{2\ell'-1}, c, b')$. This
yields a squared path $(q_1, b, a, \ldots, c, b')$ on $3(\ell' + 1)$ vertices, which gives a squared cycle $(b, a, \ldots, c)$ on $3\ell' + 1$
vertices as required.

If we seek a squared cycle $C^2_{3\ell'+2}$ with $\ell' > 1$ on the other hand, then we perform a similar process,
except that we identify not one triangle in $A_1$ but two triangles $abc$, $xyz$ connected with an edge $cx$. We
apply Lemma 13 to find a path $P = (a, \ldots, z)$ in $A_1 \setminus \{b, c, x, y\}$ on $2\ell'$ vertices. We then apply Lemma 12
once to the path $bPy$ and once to $(b, c, x, y)$. Omitting the first vertex on each of the resulting squared paths
and concatenating, we get a squared cycle $C^2_{3\ell'+2}$.

Hence we do indeed obtain squared cycles $C^2_\ell$ for all $\ell \in [3, \frac32|A_1|] \setminus \{5\}$. It remains to show that we can
also find $C^2_\ell$ for all larger $\ell$ up to $\mathrm{sc}(n,\delta)$, and $P^2_{\mathrm{sp}(n,\delta)}$. For this purpose, we first re-incorporate the vertices
that are neither in $W$ nor in any of the sets $A_i$ by examining in which of the $A_i$ they have many neighbours.
More precisely, for each $i \in [k]$, we let $X_i$ be $A_i$ together with all vertices in $V(G) \setminus W$ which are adjacent
to at least $30\xi^2 n$ vertices of $A_i$. Because every vertex in $V(G) \setminus W$ has at least $\delta - |W|$ neighbours outside
$W$, every vertex in $G - W$ is in $X_i$ for some $i$. We finish the proof by distinguishing three cases.

Case 1: $|X_i \cap X_j| \ge 2$ for some $i \neq j$. Let $v_1$ and $v_2$ be distinct vertices of $X_i \cap X_j$. Let $u_1$ and $u_2$ be
distinct neighbours in $A_i$ of $v_1$ and $v_2$ respectively, and similarly $y_1$ and $y_2$ in $A_j$. Applying Lemma 13 to
$A_i$ we can find a path from $u_1$ to $u_2$ containing any number from 4 to $|A_i| - 2$ of vertices we desire. We can
find a similar path in $A_j$ from $y_1$ to $y_2$. Concatenating these paths with $v_1$ and $v_2$ we can find a $2\ell'$-vertex
cycle $T_{2\ell'}$ in $X_i \cup X_j$ for any $10 \le 2\ell' \le |A_i| + |A_j| - 2$. There are no quadruples on $T_{2\ell'}$ using both $v_1$
and V2; the four quadruples that use one or the other each have at least (£ 4 ' 3 — 3£ 2 )?i > lOOfc common 
neighbours on $W$, while all the remaining quadruples have at least $|W| - 4\xi^2 n$ common neighbours on $W$,
so applying Lemma 12 we obtain a squared cycle on $3\ell'$ vertices and (choosing $2\ell' \ge |A_i| + |A_j| - 10$) a
squared path on at least $\mathrm{sp}(n,\delta)$ vertices. Again it is possible to perform parity corrections (prior to applying
Lemma 13) so that in this case we have $C^2_\ell \subseteq G$ for every $\ell \in [3, \frac32(|A_i| + |A_j| - 10)] \setminus \{5\}$. By (9), we have

$\mathrm{sc}(n,\delta) \le \mathrm{sp}(n,\delta) \le \frac32(|A_i| + |A_j| - 10)$.

Case 2: for some $i$ every vertex of $A_i$ is adjacent to at least one vertex outside $X_i \cup W$. Since $|A_i| > 31k\xi^2 n$
we can certainly find $31\xi^2 n$ vertices in $A_i$ all adjacent to vertices of $X_j \setminus X_i$ for some $j \neq i$. Since no vertex
of $X_j \setminus X_i$ is adjacent to $30\xi^2 n$ vertices of $A_i$ by definition of $X_i$, we find two disjoint edges $u_1v_1$ and $u_2v_2$
from $u_1, u_2 \in A_i$ to $v_1, v_2 \in X_j$. Choosing distinct neighbours $y_1$ of $v_1$ and $y_2$ of $v_2$ in $A_j$ and applying the
identical logic to the previous case we are done.

Case 3: for all $i \neq j$ we have $|X_i \cap X_j| \le 1$, and for each $i$ some vertex in $A_i$ is adjacent only to vertices
in $W \cup X_i$. Thus $|X_i| \ge \delta - |W| + 1$ for each $i$. We now first focus on finding a squared path on $\mathrm{sp}(n,\delta)$
vertices in $G$ and then turn to the squared cycles which will complete the proof. If for some $i \neq j$ we have
$X_i \cap X_j \neq \emptyset$ then we obtain a squared path of the desired length as in Case 1. So, assume that the sets $X_i$
are all disjoint. It follows that $k \le (n - |W|)/(\delta - |W| + 1)$. Since $|W| < n - \delta$ by (7), this implies

$k \le \frac{n-(n-\delta)}{\delta-(n-\delta)+1} = \frac{\delta}{2\delta-n+1}$, so that $k \le r_p(n,\delta)$,

and the largest of the sets $X_i$, say $X_1$, has at least $(n - |W|)/r_p(n,\delta) \ge \delta/r_p(n,\delta)$ vertices. Note that
$\delta(G[X_1]) \ge 30\xi^2 n$ while all but at most $2\xi^2 n + k$ vertices of $X_1$ have at least $|A_1|/2 + 30\xi^2 n > |X_1|/2 + 25\xi^2 n$
neighbours in $X_1$. So we may mark the vertices of $X_1 - A_1$ as 'bad' and apply Lemma 13 to $G[X_1]$ (where $B$
contains all 'bad' vertices) to obtain a path $T$ covering $X_1$ on which every quadruple contains at most one
'bad' vertex. Finally we apply Lemma 12 to obtain a squared path on at least $\mathrm{sp}(n,\delta)$ vertices.
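
The bound on $k$ in Case 3 replaces $|W|$ by its largest admissible value, which is legitimate because the quotient is non-decreasing in $|W|$. The following short check is a sketch under the assumption $n/2 \le \delta < n$ (so that $|W| \le n - \delta \le \delta$); the shorthand $w = |W|$ and the name $f$ are ours, not the paper's.

```latex
\[
  f(w) \;=\; \frac{n-w}{\delta-w+1}
       \;=\; 1 + \frac{n-\delta-1}{\delta-w+1},
  \qquad 0 \le w \le \delta .
\]
% The numerator n - delta - 1 is non-negative, and the denominator
% delta - w + 1 is positive and decreases as w grows, so f is non-decreasing.
\[
  w \le n-\delta
  \quad\Longrightarrow\quad
  f(w) \;\le\; f(n-\delta) \;=\; \frac{\delta}{2\delta-n+1}.
\]
```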

At last, we show that we can find in $G$ the desired long squared cycles in Case 3. Assume first that
there is a cycle of sets (relabelling the indices if necessary) $X_1, X_2, \ldots, X_s$ for some $3 \le s \le k$ such that
$X_i \cap X_{i+1 \bmod s} = \{v_i\}$ for each $i$, and the $v_i$ are all distinct. Then for each $i$ we may choose neighbours
$u_i \in A_i$ and $y_i$ in $A_{i+1 \bmod s}$ of $v_i$, and we may insist that all these $3s$ vertices are distinct. Applying
Lemma 13 to each $G[A_i]$ in turn and concatenating the resulting paths we can find a cycle $T_{2\ell'}$ for every
$4s \le 2\ell' \le |A_1| + |A_2|$ on which there are no quadruples using more than one vertex outside $\bigcup_i A_i$. Again
we may apply Lemma 12 to $T_{2\ell'}$ to obtain a squared cycle on $3\ell'$ vertices. Finally by performing parity
corrections we obtain $C^2_\ell$ for every $\ell \in [3, \frac32(|A_1| + |A_2|)] \setminus \{5\}$.

If there exists no such cycle of sets, then $\sum_{i=1}^{k} |X_i| \le n - |W| + k - 1$. Since we have also $|X_i| \ge \delta - |W| + 1$
for each $i$ and $|W| < n - \delta$, it follows from the definition of $r_c(n,\delta)$ (by establishing a relation similar to
the bound on $k$ obtained above for squared paths) that $k \le r_c(n,\delta)$, and by averaging, the largest of the sets
$X_i$, say $X_1$, contains at least $2\,\mathrm{sc}(n,\delta)/3$ vertices. As before, we can apply Lemma 13 to $X_1$ to discover a cycle
$T_{2\ell'}$ for each $4 \le 2\ell' \le |X_1|$ on which the 'bad' vertices are separated, and apply Lemma 12 to it to obtain a
squared cycle $C^2_{3\ell'}$ for each $6 \le 3\ell' \le \mathrm{sc}(n,\delta)$ as required. Again the parity correction procedure is applicable,
so we get $C^2_\ell$ for every $\ell \in [3, \mathrm{sc}(n,\delta)] \setminus \{5\}$. □

5 Concluding remarks 

The proof of Theorem 4. Our results were most difficult to prove for $\delta \approx 4n/7$. This is somewhat
surprising given the experience from the partial and perfect packing results of Komlós [S] and Kühn and
Osthus [15]. In the setting of these results it becomes steadily more difficult to prove packing results as the
minimum degree of the graph (and hence the required size of a packing) increases, with perfect packings
as the most difficult case. Yet in our setting it is relatively easy to prove our results when the minimum
degree is large. This difference occurs because we have to embed triangle-connected graphs; as
the minimum degree increases the possibilities for bad behaviour when forming triangle-connections are
reduced. This is related to the behaviour of $K_4$-free graphs: if $\delta(G) > 2v(G)/3$ then $G$ is not $K_4$-free; if
$\delta(G) > 5v(G)/8$ then by the Andrásfai–Erdős–Sós theorem [5] a $K_4$-free graph $G$ is forced to be tripartite,
while for smaller values of $\delta(G)$ there exist more possibilities.

Extremal graphs. It is straightforward to check that up to some trivial modifications the graphs
$G_p(n,\delta)$ and $G_c(n,\delta)$ are the only extremal graphs. However it is not the case that the only extremal
graph excluding some $C^2_\ell$ of chromatic number four is $K_{n-\delta,n-\delta,2\delta-n}$: when $\delta < 3n/5$ there are several quite
different extremal graphs. We believe that the graph $G_p(n,\delta)$ remains extremal for squared paths even when
$\delta$ is not bounded away from $n/2$, although as noted in Section 1 the same is not true for $G_c(n,\delta)$ and squared
cycles.

Long squared cycles. In [1] a structural description of graphs avoiding non-trivial (unsquared) cycles 
of odd length was given. The corresponding result in our setting should be that G contains no non-trivial 
squared cycles of chromatic number four if it is possible to remove the vertices of an independent set from 
G to obtain a graph with no non-trivial odd cycles. 

In addition, Theorem [5] (iii) states that if any of various odd cycles are excluded from $G$ we are guaranteed
even cycles of every length up to $2\delta(G)$, whereas the equivalent statement in our Theorem [5] contains an
error term. We believe this error term can be removed, but at the cost of significantly more technical work
with both the stability lemma and new extremal results.

Higher powers of paths and cycles. We note that Theorem 2 has a natural generalisation to higher
powers of cycles, the so-called Pósa–Seymour Conjecture; this was also proved for all sufficiently large $n$ by
Komlós, Sárközy and Szemerédi [12]. We conjecture a natural generalisation of Theorem [?] for higher powers
of paths and cycles.

Given $k$, $n$ and $\delta$, we construct an $n$-vertex graph $G_p^{(k)}(n,\delta)$ by partitioning the vertices into an 'interior'
set of $\ell = (k-1)(n-\delta)$ vertices upon which we place a complete balanced $(k-1)$-partite graph, and an
'exterior' set of $n - \ell$ vertices upon which we place a disjoint union of $\lfloor (n-\ell)/(\delta-\ell+1) \rfloor$ almost-equal
cliques. We then join every 'interior' vertex to every 'exterior' vertex. We construct $G_c^{(k)}(n,\delta)$ similarly,
permitting the cliques in the 'exterior' vertices to overlap in cut-vertices of the 'exterior' set if this reduces
the size of the largest clique while preserving the minimum degree $\delta$.
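
The construction of the path-extremal graph can be sketched in a few lines of Python, which also makes it easy to confirm the minimum degree on small instances. This is only an illustration under our own conventions: the function name `extremal_path_graph`, the vertex labelling `0..n-1` and the way the 'almost-equal' cliques are split are assumptions, and the cut-vertex variant for cycles is not covered.

```python
from itertools import combinations


def extremal_path_graph(n, delta, k):
    """Adjacency sets of the 'interior/exterior' graph described above.

    Hypothetical helper for small n; vertices are labelled 0..n-1.
    """
    ell = (k - 1) * (n - delta)              # size of the 'interior' set
    if not (0 < delta < n and ell <= delta):
        raise ValueError("need 0 < delta < n and (k-1)(n-delta) <= delta")

    adj = {v: set() for v in range(n)}

    def join(a, b):
        adj[a].add(b)
        adj[b].add(a)

    interior = list(range(ell))
    exterior = list(range(ell, n))

    # Complete balanced (k-1)-partite graph: parts of size n - delta.
    part_size = n - delta
    for a, b in combinations(interior, 2):
        if a // part_size != b // part_size:
            join(a, b)

    # Disjoint union of r cliques whose sizes differ by at most one.
    r = (n - ell) // (delta - ell + 1)
    for c in range(r):
        for a, b in combinations(exterior[c::r], 2):
            join(a, b)

    # Join every 'interior' vertex to every 'exterior' vertex.
    for a in interior:
        for b in exterior:
            join(a, b)
    return adj


def min_degree(adj):
    return min(len(nbrs) for nbrs in adj.values())
```

For instance, `min_degree(extremal_path_graph(20, 12, 2))` can be used to check that the graph built for $n = 20$, $\delta = 12$, $k = 2$ has minimum degree at least $\delta$.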

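Conjecture 14. Given $\nu > 0$ and $k$ there exists $n_0$ such that whenever $n \ge n_0$ and $G$ is an $n$-vertex graph
with $\delta(G) = \delta \ge \frac{k-1}{k}n + \nu n$, the following hold.

(i) If $P^k_\ell \subseteq G_p^{(k)}(n,\delta)$ then $P^k_\ell \subseteq G$.

(ii) If $C^k_{(k+1)\ell} \subseteq G_c^{(k)}(n,\delta)$ for some integer $\ell$, then $C^k_{(k+1)\ell} \subseteq G$.

(iii) If $C^k_\ell \subseteq G_c^{(k)}(n,\delta)$ with $\chi(C^k_\ell) = k + 2$ and $C^k_\ell \not\subseteq G$ for some integer $\ell$, then $C^k_{(k+1)\ell'} \subseteq G$ for each
integer $\ell' < k\delta - (k-1)n - \nu n$.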

It seems likely that again the $\nu n$ error term in the last statement is not required, but again (at least for
powers of cycles) it is required in the minimum degree condition.
A Proof of Lemma 7

For the proof of Lemma 7 we apply the following version of the Blow-up Lemma of Komlós, Sárközy and
Szemerédi [11].

Lemma 15 (Blow-up Lemma [11]). Given fixed $c, d > 0$, for any sufficiently small $\varepsilon > 0$ the following holds.
Let H be any graph with V(H) = V1UV2UV3 and \Vi\ > g|V(£f)|, in which each bipartite graph H[Vi,Vj] is 
(2e,d) -regular and furthermore Svi(Vj) > ^d|Vi| for each 1 < i,j < 3. 

Let $F$ be any subgraph of the complete tripartite graph with parts $V_1$, $V_2$ and $V_3$ such that the maximum
degree of $F$ is at most four. Assume further that at most four vertices $x_i$ of $F$ are endowed with sets
$C_{x_i} \subseteq V_j$ such that $x_i \in V_j$ and $|C_{x_i}| \ge c|V_j|$.

Then there is an embedding $\psi : V(F) \to V(H)$ of $F$ into $H$ with $\psi(x_i) \in C_{x_i}$ for $i \in [4]$.

We also say that the vertices $x_i$ in Lemma 15 are image restricted to $C_{x_i}$.

Proof of Lemma 7. Let $G$ be an $n$-vertex graph, and $R'$ an $(\varepsilon, d)$-reduced graph for it on $m$ vertices.

Fix a set $T' = \{T'_1, \ldots, T'_{\mathrm{CTF}(R')/3}\}$ of vertex-disjoint triangles in a triangle-component of $R'$ covering
$\mathrm{CTF}(R')$ vertices. For each triangle $T'_i = X'_{i,1}X'_{i,2}X'_{i,3}$ we may by regularity for each $j \in [3]$ remove at most
$\varepsilon|X'_{i,j}|$ vertices from $X'_{i,j}$ to obtain a set $X_{i,j}$ such that each pair $(X_{i,j}, X_{i,k})$ is not only $(2\varepsilon, d)$-regular but
also satisfies $\delta_{X_{i,k}}(X_{i,j}) \ge (d - 3\varepsilon)|X_{i,k}|$. We let $R$ be the $(2\varepsilon, d)$-reduced graph corresponding to the new
vertex partition given by replacing each $X'_{i,j}$ with $X_{i,j}$; then every edge of $R'$ carries over to $R$, and we let
$T$ be the corresponding set of $\mathrm{CTF}(R')/3$ vertex-disjoint triangles in $R$.

Fact 9. Let $X_1, \ldots, X_5$ be vertices of $R$ (not necessarily distinct), and $Z$ be any set of at most $\varepsilon|X_3|$ vertices
of $G$. Suppose that $X_3X_4$ and $X_3X_5$ are edges of $R$. Suppose furthermore that we have two vertices $u \in X_1$
and $v \in X_2$ such that $uv$ is an edge of $G$, $u$ and $v$ have at least $(d - \varepsilon)^2|X_3|$ common neighbours in $X_3$, and
$v$ has at least $(d - \varepsilon)|X_4|$ neighbours in $X_4$.

Then there is a vertex $w \in X_3 - Z$ adjacent to $u$ and $v$ such that $v$ and $w$ have at least $(d - \varepsilon)^2|X_4|$
common neighbours in $X_4$ and $w$ has at least $(d - \varepsilon)|X_5|$ neighbours in $X_5$.

Proof. Let $W$ be the set of common neighbours of $u$ and $v$ in $X_3$. Since $X_3X_4 \in E(R)$, at most $\varepsilon|X_3|$ vertices
of $W$ have fewer than $(d - \varepsilon)|\Gamma(v) \cap X_4| \ge (d - \varepsilon)^2|X_4|$ common neighbours with $v$ in $X_4$. Since $X_3X_5 \in E(R)$
at most $\varepsilon|X_3|$ vertices of $W$ have fewer than $(d - \varepsilon)|X_5|$ neighbours in $X_5$. Finally since $3\varepsilon|X_3| < (d - \varepsilon)^2|X_3|$
we can find a vertex of $W - Z$ satisfying the desired properties. □

Given a triangle walk $W = (E_1, \ldots)$ in $R$ and an orientation $U_1V_1$ of the first edge $E_1$ we wish to find
eventually a squared path in $G$ following $W$, whose first two vertices are in $U_1$ and $V_1$, in that order. First
we give a sequence of vertices of $R$ which has the property that every vertex in the sequence is adjacent to
the two preceding vertices (as is the case for a squared path).

We construct this sequence of vertices of $R$ iteratively as follows. Let $Q_1 = (U_1, V_1)$. Now for each
$2 \le i \le |W|$ successively, we define $Q_i$ as follows. The last two vertices $U_{i-1}, V_{i-1}$ of $Q_{i-1}$ are an orientation
of $E_{i-1}$. If $E_i = U_{i-1}V_i$ we create $Q_i$ by appending $(V_i, U_{i-1})$ to $Q_{i-1}$; if $E_i = V_{i-1}V_i$ we append $(V_i)$ to
$Q_{i-1}$ to create $Q_i$. At each step the final two vertices of $Q_i$ are an orientation of $E_i$; furthermore every
vertex of $Q_i$ is adjacent in $R$ to the two vertices preceding it in $Q_i$. Finally, we let $Q(W, U_1V_1) = Q_{|W|}$.
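
The iterative construction of $Q(W, U_1V_1)$ can be sketched as follows. The function names, the encoding of edges as `frozenset` pairs and the adjacency callback are our own illustrative choices; the sketch also assumes that consecutive edges of the walk are distinct and span a triangle, so repeated edges are not handled.

```python
def build_Q(walk, first_orientation):
    """Vertex sequence Q(W, U1V1) for a triangle walk W in R.

    walk: list of edges of R, each a frozenset of two vertices.
    first_orientation: tuple (U1, V1) ordering the first edge.
    Assumption: consecutive edges are distinct and share one vertex.
    """
    u, v = first_orientation
    if frozenset((u, v)) != walk[0]:
        raise ValueError("orientation does not match the first edge")
    seq = [u, v]
    for edge in walk[1:]:
        prev = frozenset((u, v))
        (new,) = edge - prev      # third vertex of the common triangle
        (kept,) = edge & prev     # vertex shared with the previous edge
        if kept == v:
            # E_i = V_{i-1} V_i: append (V_i)
            seq.append(new)
            u, v = v, new
        else:
            # E_i = U_{i-1} V_i: append (V_i, U_{i-1})
            seq.extend([new, u])
            u, v = new, u
    return seq


def adjacent_to_two_predecessors(seq, adjacent):
    """Check that each vertex is adjacent to the (up to) two before it."""
    return all(
        adjacent(seq[i], seq[j])
        for i in range(1, len(seq))
        for j in (i - 1, i - 2)
        if j >= 0
    )


# Example inside a copy of K4 on {A, B, C, D}, where distinct means adjacent.
k4_walk = [frozenset(e) for e in ("AB", "BC", "CD", "DA")]
forward = build_Q(k4_walk, ("A", "B"))
backward = build_Q(k4_walk, ("B", "A"))
```

In this example the two orientations of the first edge give sequences of different lengths, which is the effect exploited when the lengths are later counted modulo three.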

For each $1 \le i \le \mathrm{CTF}(R)/3 - 1$ let $W_i$ be a fixed triangle walk in $R$ whose first edge is in $T_i$ and whose
last is in $T_{i+1}$. We suppose (repeating edges in the triangle walk $W_i$ if necessary) that each triangle walk
contains at least ten edges, and that each walk $W_i$ with more than ten edges is of minimal length. We have
$|W_i| \le \binom{m}{2}$ for each $i$.

We prove first that $G$ contains $C^2_{3\ell}$ for each $3\ell \le (1 - d)\,\mathrm{CTF}(R)n/m$. When $3\ell \le 3(1 - d)n/m$ we
have $C^2_{3\ell} \subseteq K_{(1-d)n/m,(1-d)n/m,(1-d)n/m}$ and thus by Lemma 15 we can find $C^2_{3\ell}$ as a subgraph of $G$ (whose
vertices are in $T_1$, with no restrictions required). Otherwise, let $UV$ be the first edge of the triangle walk
$W_1$. Let $W'$ be the triangle walk obtained by concatenating $W_1, \ldots, W_{\mathrm{CTF}(R)/3-1}$ and removing the last
edge of $W_i$ if it is identical to the first edge of $W_{i+1}$.

We choose two adjacent vertices $u$ and $v$ of $G$ in $U$ and $V$ respectively, such that $u$ and $v$ have $(d - \varepsilon)^2 n/m$
common neighbours in both the third vertex of $T_1$ and the third vertex of $Q(W', UV)$, such that $v$ has
$(d - \varepsilon)n/m$ neighbours in the third vertex of $T_1$, and such that $v$ has $(d - \varepsilon)n/m$ neighbours in the third
vertex of $Q(W', UV)$ (which is possible by regularity of the various pairs). Now we apply Fact 9 with the
vertices $u$ and $v$ and the first five vertices of $Q(W', UV)$ to obtain a third vertex $v'$. Now by repeatedly
applying Fact 9 we construct a sequence of vertices $P'$ (starting with $u, v$), where the $i$th vertex of $P'$ is in
the $i$th vertex of $Q(W', UV)$, and the vertices are all distinct (noting that $3|W'| < \varepsilon n/m$). Thus $P'$ is a
squared path running from $T_1$ to $T_{\mathrm{CTF}(R)/3-1}$ following all the triangle walks $W_i$.

We construct similarly (and without re-using vertices) for each $1 \le i \le \mathrm{CTF}(R)/3 - 1$ a squared path $P_i$
following the triangle walk $W_i$. However, we use the opposite orientation for the first edge: that is, instead of
constructing $P_1$ from $Q(W_1, UV)$ we use $Q(W_1, VU)$, and similarly for each $P_i$ we use the opposite orientation
of the first edge of $W_i$ to that used in $P'$. We note that the total number of vertices on all of these squared
paths is not more than $6m\binom{m}{2} < \varepsilon n/m$. Finally, we remove from $T_1$ all vertices of $P' \cup P_1 \cup \cdots \cup P_{\mathrm{CTF}(R)/3-1}$;
it satisfies the conditions of Lemma 15 and thus we may embed a squared path $S_1$ into $T_1$, with the four
restrictions that its first vertex is a common neighbour of the first two vertices of $P'$, its second a neighbour
of the first vertex of $P'$, its penultimate vertex a neighbour of the first vertex of $P_1$ and its final vertex
a common neighbour of the first two vertices of $P_1$ (noting that by choice of the first two vertices of $P'$
and of $P_1$ the sets to which these vertices are restricted are indeed of size $cn/m$ when $c < d/4$). This
squared path may have $3k + f_1$ vertices, where $f_1 \in \{0, 1, 2\}$ is fixed (by the restrictions of the start and end
vertices) and we may choose any integer $k \in [10, (1 - d)n/m]$. Similarly we may apply Lemma 15 to each $T_i$,
$2 \le i \le \mathrm{CTF}(R)/3$, to obtain squared paths $S_i$ whose length we may (up to the similar restrictions) choose.

Finally $S = P' \cup S_1 \cup P_1 \cup \ldots \cup P_{\mathrm{CTF}(R)/3-1} \cup S_{\mathrm{CTF}(R)/3}$ forms a squared cycle in $G$. It is not immediately
obvious that the number of vertices of $S$ is divisible by three, but note that

$|Q(W, UV)| + |Q(W, VU)| \equiv 1 \pmod 3$

for any triangle walk $W$ (with at least two edges) and first edge $UV$, by construction. It follows that indeed
$S = C^2_{3k}$ for some integer $k$, and we may choose any $3k \in [6m^3, (1 - d)\,\mathrm{CTF}(R)n/m]$, as required.
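
The congruence can be checked by tracking both orientations of the first edge simultaneously. The following derivation is a sketch under the assumption that consecutive edges of $W$ are distinct and lie in a common triangle (repeated edges would need separate treatment); the letters $a$, $b$, $c$ are our own labels.

```latex
% Induction hypothesis: if the sequence built from UV currently ends in (a,b),
% then the sequence built from VU currently ends in (b,a).
% Let c be the third vertex of the triangle containing E_{i-1} = ab and E_i.
\begin{itemize}
  \item If $E_i = bc$: from $(a,b)$ we append $(c)$ and end in $(b,c)$;
        from $(b,a)$ we append $(c,b)$ and end in $(c,b)$.
  \item If $E_i = ac$: from $(a,b)$ we append $(c,a)$ and end in $(c,a)$;
        from $(b,a)$ we append $(c)$ and end in $(a,c)$.
\end{itemize}
% In both cases the endings remain reversed and exactly three vertices are
% added in total, starting from 2 + 2 = 4 vertices:
\[
  |Q(W,UV)| + |Q(W,VU)| \;=\; 4 + 3\,(|W| - 1) \;\equiv\; 1 \pmod 3 .
\]
```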

When every triangle-component of $R$ contains $K_4$ we must also obtain squared cycles whose lengths
are not divisible by three. Observe that if $ABCD$ is a copy of $K_4$ in $R$, then the vertex sequences $ABC$,
$ABCDABC$ and $ABCDABCDABC$ each start and end with the same pair and (by use of Fact 9) can
be used to construct squared paths in $G$ which take any of the three possible lengths modulo three. We
construct $C^2_\ell$ for $\ell \in [3, 20] \setminus \{5\}$ within a copy of $K_4$ in $R$ directly (by the above methods). To obtain $C^2_\ell$
with $21 \le \ell \le 3(1 - d)n/m$ we remove at most $2\varepsilon n/m$ vertices from each of $A$, $B$ and $C$ to obtain a triangle
satisfying the conditions of Lemma 15, construct a short path following the appropriate vertex sequence for
$\ell \bmod 3$ and apply Lemma 15 to obtain $C^2_\ell$. Finally, to obtain longer squared cycles we perform the same
construction as above, with the exception that $W'$ is any triangle walk to and from a copy of $K_4$, and so
$Q(W', UV)$ may be taken (using one of the three vertex sequences above) to have any desired number of
vertices modulo three (and not more than $6m^2$ in total).
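
The claim about the three vertex sequences in a copy of $K_4$ is easy to confirm mechanically. The snippet below is a small sanity check, not part of the proof: it reads 'start and end with the same pair' as saying that all three sequences begin with $A, B$ and end with $B, C$, which is our interpretation, and the helper name is hypothetical.

```python
def valid_in_k4(seq):
    """In K4 two vertices are adjacent exactly when they are distinct, so a
    sequence is admissible when each vertex differs from the two before it."""
    return all(
        seq[i] != seq[j]
        for i in range(1, len(seq))
        for j in (i - 1, i - 2)
        if j >= 0
    )


sequences = ["ABC", "ABCDABC", "ABCDABCDABC"]

# Every vertex is adjacent to its two predecessors.
all_valid = all(valid_in_k4(s) for s in sequences)

# All three sequences share the same first pair and the same last pair.
same_ends = all(s[:2] == "AB" and s[-2:] == "BC" for s in sequences)

# The lengths 3, 7 and 11 cover every residue class modulo three.
residues = sorted(len(s) % 3 for s in sequences)
```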

Lastly, when we are required to construct a squared path between two specified edges $u_1v_1$ (with $2dn/m$
common neighbours in both $X_1$ and $Y_1$) and $u_2v_2$ (with $2dn/m$ common neighbours in both $X_2$ and $Y_2$)
using triangles $T$ in $R$, we apply the identical strategy, noting that the conditions on $u_1v_1$ and $u_2v_2$ are
already suitable for application of Fact 9. □